Web Design & Development


Web development is a broad term for any activities related to developing a web site for the 
World Wide Web or an intranet. This can include e-commerce business development, web 
design, web content development, client-side/server-side coding, and web server configuration. 
However, among web professionals, "web development" usually refers only to the non-design 
aspects of building web sites, e.g. writing markup and coding. Web development can range from 
developing the simplest static single page of plain text to the most complex web-based internet 
applications, electronic businesses, or social network services. 

Web design is the process of conceptualizing, planning, modeling, and executing the delivery of
electronic media over the Internet, in the form of a markup language suitable for interpretation
by a web browser and display as a graphical user interface (GUI).


Web development


For larger businesses and organizations, web development teams can consist of hundreds of
people. Smaller organizations may require only a single permanent or contracting webmaster, or
may assign web development as a secondary duty to related positions such as graphic designer or
information systems technician. Web development may be a collaborative effort between
departments rather than the domain of a designated department.


Web development as an industry 


Since the mid-1990s, web development has been one of the fastest growing industries in the
world. In 1995 there were fewer than 10,000 web development companies in the United States
alone; by 2005 there were over 30,000. The web development industry was
expected to grow by over 20% by 2010. The growth of this industry is being pushed by large
businesses wishing to sell products and services to their customers and to automate business
workflow, as well as by the growth of many small web design and development companies.

In addition, the cost of web site development and hosting has dropped dramatically during this
time. Instead of costing tens of thousands of dollars, as was the case for early websites, a
simple web site can now be developed for less than a thousand dollars, depending on its
complexity and amount of content. Smaller web site development companies are now able to make
web design accessible to both smaller companies and individuals, further fueling the growth of
the web development industry. As for web development tools and platforms, many systems are
available to the public free of charge. A popular example is the LAMP stack (Linux, Apache,
MySQL, PHP), which is usually distributed free of charge. This alone has led many people around
the globe to set up new web sites daily, contributing to the growing popularity of web
development. Another contributing factor has been the rise of easy-to-use WYSIWYG web
development software, most prominently Microsoft FrontPage and Adobe Dreamweaver. Using such
software, virtually anyone can develop a web page in a matter of minutes. Knowledge of
HyperText Markup Language (HTML) or of programming languages is not required, but is
recommended for professional results.

The next generation of web development tools builds on the strong growth in LAMP and
Microsoft .NET technologies to make the Web a way to run applications online. Web
developers now help to deliver, as Web services, applications that were traditionally
available only on a desktop computer.

Instead of running executable code on a local computer, users interact with online
applications to create new content. This has created new methods of communication and opened
many opportunities to decentralize information and media distribution. Users are now able to
interact with applications from many locations, instead of being tied to a specific workstation
for their application environment.

Examples of dramatic transformation in communication and commerce led by web
development include e-commerce. Online auction sites such as eBay have changed the way
consumers buy goods and services. Online resellers such as Amazon.com and
Buy.com (among many others) have transformed the shopping and bargain-hunting
experience for many consumers. Another good example of transformative communication led by
web development is the blog. Web applications such as WordPress and b2evolution have created
easily implemented blog environments for individual web sites. Open source content systems
such as TYPO3, XOOPS, Joomla, and Drupal have extended web development into new modes of
interaction and communication.


Typical Areas 


Web development can be split into many areas. A typical, basic web development hierarchy
might consist of:

• Client-side coding
  o CSS
  o XHTML (in accordance with modern web design standards, XHTML's use is replacing the
    older HTML)
  o JavaScript
• Server-side coding
  o PHP and MySQL
  o ASP and MSSQL
  o ASP.NET and MSSQL
  o CGI and/or Perl
  o Java, e.g. J2EE or WebObjects
  o Ruby, e.g. Ruby on Rails
  o ColdFusion
• In practice, many web developers will also have interdisciplinary skills and roles,
  including:
  o Graphic design / web design
  o Information architecture and copywriting/copyediting with web usability,
    accessibility, and search engine optimisation in mind
  o Project management, QA, and other aspects common to IT development in general

The list above is a simple website development hierarchy and can be extended to include all
client-side and server-side aspects. It is still important to remember that web development is
generally split into client-side coding, which covers aspects such as layout and design, and
server-side coding, which covers the website's functionality and back-end systems.

Looking at these items from an "umbrella approach", client-side code such as XHTML is stored
and interpreted on a local client (in a web browser), whereas server-side code is not available
to the client and is executed on a web server, which generates the appropriate XHTML and sends
it to the client. Because client-side code can be altered on the local client and the page
resubmitted with the modified content, web developers must bear in mind the security of their
server-side scripts. A server-side script that accepts content from a locally modified
client-side page without validating it shows poor sanitization.
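
The point about locally modified pages can be illustrated with a minimal sketch of server-side
validation. Everything named here is hypothetical and not taken from any particular framework:
the form fields (`email`, `plan`, `comment`), the list of allowed plans, and the 500-character
limit are assumptions chosen for illustration. The idea is only that the server re-checks every
value itself, because limits expressed in the markup can be edited away in the browser.

```python
import re

# Hypothetical whitelist: the values the original form offered in a <select>.
ALLOWED_PLANS = {"basic", "standard", "premium"}
# A deliberately loose plausibility check, not a full address validator.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_signup(form: dict) -> dict:
    """Re-check every submitted field on the server.

    Returns a dict mapping field names to error messages; an empty dict
    means the submission passed these checks.
    """
    errors = {}
    email = str(form.get("email", "")).strip()
    if not EMAIL_PATTERN.match(email):
        errors["email"] = "not a plausible e-mail address"
    # A locally edited page can submit any value, not only the listed options.
    if str(form.get("plan", "")) not in ALLOWED_PLANS:
        errors["plan"] = "unknown plan"
    # Length limits set in the markup (maxlength) are only hints to the browser.
    if len(str(form.get("comment", ""))) > 500:
        errors["comment"] = "comment too long"
    return errors
```

A submission such as `{"email": "a@example.com", "plan": "free-forever"}` would be rejected for
its `plan` value even though the original page never offered that option.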


Security 


Web development must take many security considerations into account, such as checking data
entered through forms for errors and sanitizing the data entered in those fields. Malicious
practices such as SQL injection can be carried out by users with ill intent yet only primitive
knowledge of web development as a whole. In addition, scripts can be exploited to give an
attacker unauthorized access to information such as email addresses, passwords, and
protected content.
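
As a hedged illustration of SQL injection and one common defence, the sketch below uses Python's
standard `sqlite3` module with an in-memory database. The `users` table, its columns, and the
sample addresses are invented for the example. It contrasts a query built by string
concatenation with a parameterized query, in which the submitted value is passed separately from
the SQL text.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, email: str):
    """Look up a user by e-mail with a parameterized query."""
    # The "?" placeholder passes the value separately from the SQL text, so
    # quotes in the input cannot change the structure of the statement.
    cur = conn.execute("SELECT id, email FROM users WHERE email = ?", (email,))
    return cur.fetchall()


def find_user_unsafe(conn: sqlite3.Connection, email: str):
    """Anti-pattern shown only for contrast: input pasted into the SQL text."""
    query = "SELECT id, email FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchall()


# Hypothetical data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)

attack = "x' OR '1'='1"
safe_rows = find_user(conn, attack)           # matches no user
unsafe_rows = find_user_unsafe(conn, attack)  # condition is always true: every row
```

With the concatenated query the attacker's quote characters become part of the statement and
every row is returned; with the placeholder the same input is treated as an ordinary string that
matches nothing.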

Some of this depends on the server environment (most commonly Apache or Microsoft
IIS) on which the scripting language, such as PHP, Ruby, or ASP, is running, and is therefore
not necessarily the web developer's responsibility to maintain. However, stringent testing of
web applications before public release is encouraged to prevent such exploits from occurring.


Mobile Web 


The Mobile Web refers to the World Wide Web as accessed from mobile devices such as 
cell phones, PDAs, and other portable gadgets connected to a public network. Access does not 
require a desktop computer. 

Today, many more people have access to mobile devices than to a desktop computer.

However, Mobile Web access today still suffers from interoperability and usability problems. 
This is partly due to the small physical size of the screens of mobile devices and partly due to the 
incompatibility of many mobile devices with not only computer operating systems, but also the 
format of much of the information available on the Internet. 


Standards 


The development of standards is one approach being implemented to address the
interoperability, usability, and accessibility issues surrounding mobile web usage.

The W3C Mobile Web Initiative is a new initiative set up by the W3C to develop best 
practices and technologies relevant to the Mobile Web. The goal of the initiative is to make 
browsing the Web from mobile devices more reliable and accessible. The main aim is to evolve 
standards of data formats from Internet providers that are tailored to the specifications of 
particular mobile devices. The W3C has published guidelines (Best Practices, Best Practices 
Checker Software Tool) for mobile content, and is actively addressing the problem of device 
diversity by establishing a technology to support a repository of Device Descriptions. 

W3C is also developing a validating schema to assess the readiness of content for the mobile
web, through its mobileOK Scheme, which will help content developers to quickly determine if
their content is web-ready. The W3C guidelines and mobileOK approach have not been immune
from criticism, and an alternative set of guidelines has been made available. This puts the
emphasis on Adaptation, which is now seen as the key process in achieving the Ubiquitous Web,
when combined with a Device Description Repository. An alternative approach is to adopt a
Multi-Web Practice, whereby for a given theme a set of URIs for different devices is developed,
with each URI having content appropriate to its designated device. A bookmark for this set of
URIs held in an array is known as an AGI (Array of Graphic Identifiers).
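
The Multi-Web idea of one theme with a set of device-specific URIs can be sketched as a simple
lookup. This is an illustration under stated assumptions, not a method described by the W3C or
by any device description repository: the URIs, the three device classes, and the crude
header-matching rules are all hypothetical. Only the MIME type `text/vnd.wap.wml` (the
registered type for WML) is a real identifier.

```python
# Hypothetical set of URIs for one theme, one per device class.
DEVICE_URIS = {
    "desktop": "http://www.example.com/news/",
    "mobile": "http://m.example.com/news/",
    "wap": "http://wap.example.com/news.wml",
}

# Illustrative substrings only; real detection would rely on a maintained
# source of device descriptions rather than a short hand-written list.
MOBILE_HINTS = ("mobile", "symbian", "windows ce", "blackberry")


def classify_device(user_agent: str, accept: str = "") -> str:
    """Guess a device class from request headers (a deliberately crude rule)."""
    accept = accept.lower()
    # A client that accepts WML but not HTML is treated as a WAP device.
    if "text/vnd.wap.wml" in accept and "text/html" not in accept:
        return "wap"
    ua = user_agent.lower()
    if any(hint in ua for hint in MOBILE_HINTS):
        return "mobile"
    return "desktop"


def uri_for(user_agent: str, accept: str = "") -> str:
    """Pick the URI whose content suits the requesting device."""
    return DEVICE_URIS[classify_device(user_agent, accept)]
```

A server could use `uri_for` to redirect a request to the appropriate member of the set, while
the set itself (here the `DEVICE_URIS` mapping) plays the role of the bookmarked array.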

mTLD, the registry for .mobi, has released a free testing tool called the MobiReady Report to
analyze the mobile readiness of a website. It performs a free page analysis and gives a Mobi
Ready score. This report tests the mobile-readiness of the site using industry best practices
and standards.

Other standards for the mobile web are being documented and explored for particular 
applications by interested industry groups, such as the use of the mobile web for the purpose of 
education and training e.g. Standards for M-Learning Project. 


Development 


The Mobile Web primarily utilises lightweight pages written in Extensible Hypertext Markup 
Language (XHTML) or Wireless Markup Language (WML) to deliver content to mobile 
devices. 

New tools such as Macromedia's Flash Lite or Sun's J2ME enable the production of user 
interfaces customized for mobile devices. In any case, with the increasing movement away from 
website-based content towards delivery via RSS, Atom and other formats in which content is 
divorced from presentation, the issue of microcontent becomes less of a problem as the device 
rather than the content-provider is enabled to specify how the content is displayed. 
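
To make the separation of content from presentation concrete, the following sketch parses a
small RSS 2.0 document with Python's standard `xml.etree.ElementTree` module and lets the
"device" decide the layout through a width parameter. The feed text and the `render` function
are invented for illustration. The feed carries only titles and descriptions, and the same items
are laid out differently for a narrow and a wide display.

```python
import textwrap
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed; the content carries no layout information.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <description>A short summary of the first story for readers.</description>
    </item>
    <item>
      <title>Second headline</title>
      <description>Another summary.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml: str):
    """Return (title, description) pairs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", ""), item.findtext("description", ""))
        for item in root.iter("item")
    ]


def render(items, width: int) -> str:
    """Lay the same items out for a display `width` characters wide."""
    lines = []
    for title, description in items:
        lines.append(title.upper()[:width])
        lines.extend(textwrap.wrap(description, width))
    return "\n".join(lines)


items = read_items(FEED)
narrow = render(items, 20)  # e.g. a small phone screen
wide = render(items, 80)    # e.g. a desktop terminal
```

The provider publishes `FEED` once; whether it appears as `narrow` or `wide` is decided entirely
on the receiving side.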


Top-level Domain 


The .mobi sponsored top-level domain was launched specifically for the mobile internet by a 
consortium of companies including Google, Microsoft, Nokia, Samsung, and Vodafone. By 
forcing sites to comply with mobile web standards, .mobi ensures visitors a consistent and 
optimized experience on their mobile device. 


Mobile Web 2.0 


An example Web 2.0 technology used on the mobile web is the blog, resulting in the term 
moblog. Critics point to the difficulties of transferring Web 2.0 concepts such as open standards 
to the mobile web. On the other hand, advocates present it as a means of pushing information up 
onto the web in addition to bringing information down to the user.[1] 


Faxing via Mobile Web 


With the advancement of internet faxing, faxes are being sent online. Furthermore, they can 
be sent and received through Personal Digital Assistants (PDAs). 


Advertising on the Mobile Web 


Advertisers are increasingly using the mobile Web as a platform to reach consumers. A recent
study by the Online Publishers Association [2] reports that about one in ten mobile Web users
said they have made a purchase based on a mobile Web ad, while 23% said they have visited a
Web site, 13% said they have requested more information about a product or service, and 11%
said they have gone to a store to check out a product.


Web developer 


A web developer is a software developer or software engineer who is specifically engaged in
the development of World Wide Web applications, or distributed network applications that
run over the HTTP protocol between an HTTP server (a web server) and an HTTP client (a web
browser). Web developers can be webmasters who handle website administration and web design
as well as web application development, or they can simply be web application developers. Web
developers may be formally in charge of web development within their organizations, or may
work as freelance software developers.
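
The server/client relationship described above can be sketched with Python's standard library
alone. This is a minimal illustration, not a production setup: the handler class `HelloHandler`
and the helper `fetch_once` are hypothetical names, and the server listens only on the local
machine on a port chosen by the operating system. The handler plays the web server, and
`urllib` stands in for the browser's request.

```python
import html
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: build a small HTML page for every GET request."""

    def do_GET(self):
        # Escape the path before echoing it into markup.
        body = f"<html><body><p>You asked for {html.escape(self.path)}</p></body></html>"
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path: str = "/hello") -> str:
    """Start the server on a free local port, request `path`, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # Ignore any proxy settings so the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        # Client side: what a browser does before rendering the markup.
        with opener.open(f"http://127.0.0.1:{port}{path}") as response:
            return response.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once("/hello")` returns the HTML that the server generated for that request,
which is the text a browser would then interpret and display.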


Tools of the trade 


As software developers for the Web, web developers use:

• Interpreted scripting languages such as Python, Perl, PHP, Ruby, VBScript, and JScript,
  as well as compiled languages such as Java, C, C++, C#, and VB.NET that are either
  pre-compiled or just-in-time compiled, to create dynamic and interactive server-side
  web applications.
• JavaScript (a client-side interpreted scripting language, standardized as ECMAScript),
  HTML markup, Cascading Style Sheets, and the web browser DOM, to build client-side web
  applications. Client-side web development can also include the use of so-called rich
  client technologies such as Flash, Java, Curl, XUL, or others to assemble web
  applications that in behaviour and function more closely resemble classic desktop
  applications.
• Relational database management systems (RDBMS) such as MySQL, Microsoft SQL Server,
  Oracle Database, and PostgreSQL, as well as the self-describing markup language XML
  (Extensible Markup Language), to handle data collection from web users and to store
  public or private data that is shared over networks via the Web.


Setting a standard 


Web developers are the gatekeepers of the Web. They are supposed to adhere to the open 
standards and guidelines created by the W3C (World Wide Web Consortium) when creating any 
programming for the web. Often, web developers help contribute to these open standards and 
guidelines through their work on open source projects working to help enhance and debug web- 
based technologies. 

However, because of the fairly low barrier to entry (freely available development
environments, such as web server environments and development languages, and freely available
tutorials and information on how to do web development), novice web developers more often than
not do not "adhere to the open standards and guidelines". Additionally, poorly designed or
proprietary software tools that don't follow the "open standards and guidelines" create ad hoc
and de facto standards which, unfortunately, must be followed in order to "make things work".
This was especially true during the "Browser Wars" of the 1990s. It is becoming less true as
more and better tools enter the marketplace.


Common Misconceptions 


Contrary to a common misconception among laypeople, a web developer does not always create
graphics, logos, or identity, or written, video, or audio content for a website, although
some do. Web designers, web copy editors, and web content creators are different from web
developers.


See also 


• Web design


Web directory 


A web directory or link directory is a directory on the World Wide Web. It specializes in 
linking to other web sites and categorizing those links. 

A web directory is not a search engine and does not display lists of web pages based on
keywords; instead, it lists web sites by category and subcategory. The categorization is
usually based on the whole web site, rather than one page or a set of keywords, and sites are
often limited to inclusion in only one or two categories. Web directories often allow site
owners to directly submit their site for inclusion, and have editors review submissions for
fitness.
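
The difference from a keyword index can be sketched as a data structure in which whole sites are
filed under category paths. This is an illustrative toy under stated assumptions, not how any
particular directory is implemented: the `Directory` class, its method names, and the sample
categories and sites are all hypothetical.

```python
from collections import defaultdict


class Directory:
    """A toy web directory: whole sites filed under category paths."""

    def __init__(self):
        # Maps a category path such as ("Computers", "Internet") to its sites.
        self._sites = defaultdict(list)

    def add(self, path, name, url):
        """File a site under exactly one category path."""
        self._sites[tuple(path)].append((name, url))

    def listing(self, path):
        """Return sites filed under `path` or any of its subcategories."""
        prefix = tuple(path)
        found = []
        for category, sites in self._sites.items():
            if category[:len(prefix)] == prefix:
                found.extend(sites)
        return sorted(found)

    def subcategories(self, path):
        """Return the names of the categories directly below `path`."""
        prefix = tuple(path)
        return sorted({
            category[len(prefix)]
            for category in self._sites
            if len(category) > len(prefix) and category[:len(prefix)] == prefix
        })


directory = Directory()
directory.add(["Computers", "Internet", "Web Design"], "Example Studio", "http://studio.example/")
directory.add(["Computers", "Programming"], "Example Code", "http://code.example/")
directory.add(["Arts", "Music"], "Example Records", "http://records.example/")
```

A visitor browses by walking `subcategories` from the top level downward and reading `listing`
at each step; no keyword matching against page text is involved.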

RSS directories are similar to web directories, but contain collections of RSS feeds, instead 
of links to web sites. 


General 


Some directories are very general in scope and list websites across a wide range of 
categories, regions and languages. But there are also a large number of niche directories, which 
focus on restricted regions, single languages, or specialist sectors. 

Examples of well known, general, web directories are Yahoo! Directory and the Open 
Directory Project (ODP). ODP is significant due to its extensive categorization and large number 
of listings and its free availability for use by other directories and search engines (many sites 
violate its terms of use by using its content without acknowledgement). 

A debate over the quality of directories and databases continues, as search engines use ODP's 
content without real integration, and some experiment using clustering. There have been many 
attempts to make directory development easier, whether using a "links for all" type link 
submission site using a script, or any number of available PHP portals and programs. Recently, 
social software techniques have spawned new efforts of categorization, with Amazon.com 
adding tagging to their product pages. 

Directories have various types of listings, often dependent upon the price paid for inclusion:

• Free Submission - there is no charge for review of the site
• Reciprocal Link - the site submitted must link back to the directory in order to be listed
• Paid Submissions - a fee is charged for reviewing the submitted link
• No Follow - there is a rel="nofollow" attribute associated with the link, meaning search
  engines will not follow the link
• Featured Link - the link is given a premium position in the category where it is submitted
• Featured Homepage Link - the link may be listed on the homepage of the directory
• Bid for Position - a recent innovation (2007) where sites are ordered based on bids
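
For the No Follow listing type, a site owner may want to check how a directory actually marks up
a link. The sketch below uses Python's standard `html.parser` module to list each link on a page
together with whether its `rel` attribute contains the `nofollow` token. The class and function
names are made up for this example, and the page would be supplied by the caller.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect (href, is_nofollow) for every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel holds space-separated tokens, e.g. rel="external nofollow".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit(html_text: str):
    """Return the list of (href, is_nofollow) pairs found in `html_text`."""
    parser = LinkAudit()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Splitting `rel` into tokens matters because the attribute can carry several values at once, so a
plain equality test against "nofollow" would miss links marked `rel="external nofollow"`.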


Human-edited directories 


A human-edited directory is created and maintained by editors who add links based on the 
policies particular to that directory. 

Human-edited directories are often targeted by SEOs on the basis that links from reputable 
sources will improve rankings in the major search engines. Some directories may prevent search 
engines from rating a displayed link by using redirects, nofollow attributes, or other techniques. 

Many human-edited directories, including the Open Directory Project and the World Wide 
Web Virtual Library, are edited by volunteers, who are often experts in particular categories. 
These directories are sometimes criticized due to long delays in approving submissions, or for 
rigid organizational structures and disputes among volunteer editors. 

In response to these criticisms, some volunteer-edited directories have adopted wiki 
technology, to allow broader community participation in editing the directory (at the risk of 
introducing lower-quality, less objective entries). 

Another direction taken by some web directories is the paid-for-inclusion model. This method
enables the directory to offer timely inclusion for submissions and generally results in fewer
listings. Such directories often offer additional listing options to further enhance listings,
including featured listings and additional links to inner pages of the listed web site. These
options typically carry an additional fee, but offer significant help and visibility to sites
and/or their inside pages.

Today, submission of websites to web directories is considered a common SEO (search
engine optimization) technique for obtaining vital back-links to the submitted web site. One
distinctive feature of ‘directory submission’ is that it cannot be fully automated like search
engine submissions. Manual directory submission is a tedious and time-consuming job and is
often outsourced by webmasters.


Web engineering 


The World Wide Web has become a major delivery platform for a variety of complex and
sophisticated enterprise applications in several domains. In addition to their inherent
multifaceted functionality, these Web applications exhibit complex behavior and place some
unique demands on their usability, performance, security and ability to grow and evolve.

However, a vast majority of these applications continue to be developed in an ad-hoc way, 
contributing to problems of usability, maintainability, quality and reliability [1-4]. While Web 
development can benefit from established practices from other related disciplines, it has certain 
distinguishing characteristics that demand special considerations. In recent years, there have
been some developments towards addressing these problems and requirements. As an emerging
discipline, Web engineering actively promotes systematic, disciplined and quantifiable 
approaches towards successful development of high-quality, ubiquitously usable Web-based 
systems and applications [1,2]. 

In particular, Web engineering focuses on the methodologies, techniques and tools that are 
the foundation of Web application development and which support their design, development, 
evolution, and evaluation. Web application development has certain characteristics that make it 
different from traditional software, information system, or computer application development. 

Web engineering is multidisciplinary and encompasses contributions from diverse areas: 
systems analysis and design, software engineering, hypermedia/hypertext engineering, 
requirements engineering, human-computer interaction, user interface, information engineering, 
information indexing and retrieval, testing, modelling and simulation, project management, and 
graphic design and presentation. 

Web engineering is neither a clone, nor a subset of software engineering, although both 
involve programming and software development. While Web Engineering uses software 
engineering principles, it encompasses new approaches, methodologies, tools, techniques, and 
guidelines to meet the unique requirements of Web-based applications. 

For an introduction to Web engineering, see "Web Engineering: Introduction and
Perspectives" by San Murugesan and Athula Ginige, Chapter 1 in "Web Engineering: Principles
and Techniques" (Suh, W. ed.), Idea Group Publishing, 2005.


Web engineering as a discipline 


Proponents of web engineering supported its establishment as a discipline at an early stage
of the Web. The First Workshop on Web Engineering was held in conjunction with the World Wide
Web Conference in Brisbane, Australia, in 1998. San Murugesan, Yogesh Deshpande, Steve Hansen,
and Athula Ginige, from the University of Western Sydney, Australia, formally promoted web
engineering as a new discipline in the first ICSE workshop on Web Engineering in 1999 [1].
Since then they have published a series of papers in a number of journals, conferences, and
magazines to promote their view, and have gained wide support. Major arguments for web
engineering as a new discipline are:

• WIS (Web Information System) and the WIS development process are different and
  unique [2, 7].
• Web engineering is multi-disciplinary; no single discipline (such as software
  engineering) can provide the complete theoretical basis, body of knowledge, and
  practices to guide WIS development [5].
• Issues of evolution and lifecycle management differ when compared to more
  ‘traditional’ applications.
• Web-based information systems and applications are pervasive and non-trivial.
  The prospect of the Web as a platform will continue to grow, and it is worth being
  treated specifically.


and independent web engineering is, compared with other disciplines. 


Main topics of Web engineering include, but are not limited to, the following areas: 


Web Process & Project Management Disciplines


• Development Process and Process Improvement of Web Applications
• Web Project Management and Risk Management
• Collaborative Web Development


Web Requirements Modeling Disciplines


• Business Processes for Applications on the Web
• Process Modelling of Web Applications
• Requirements Engineering for Web Applications


Web System Design Disciplines, Tools & Methods


• UML and the Web
• Conceptual Modeling of Web Applications (a.k.a. Web modeling)
• Prototyping Methods and Tools
• Web Design Methods
• CASE Tools for Web Applications
• Web Interface Design
• Data Models for Web Information Systems


Web System Implementation Disciplines


• Integrated Web Application Development Environments
• Code Generation for Web Applications
• Software Factories for/on the Web
• Web 2.0, AJAX, E4X and Other New Developments
• Web Services Development and Deployment
• Empirical Web Engineering


Web System Testing Disciplines


• Testing and Evaluation of Web Systems and Applications
• Testing Automation, Methods and Tools for Web Applications


Web Applications Categories Disciplines


• Semantic Web Applications
• Ubiquitous and Mobile Web Applications
• Mobile Web Application Development
• Device Independent Web Delivery
• Localization and Internationalization of Web Applications


Web Quality Attributes Disciplines


• Web Metrics, Cost Estimation, and Measurement
• Personalisation and Adaptation of Web Applications
• Web Quality
• Usability of Web Applications
• Web Accessibility
• Performance of Web-based Applications


Content-related Disciplines


• Web Content Management
• Multimedia Authoring Tools and Software


Web Engineering Education 


Web Engineering Resources


Organizations 


• Web Engineering Community
• WISE Society
• ACM SIGWEB:
• World Wide Web Consortium: http://www.w3.org


Books


• "Web Engineering - The Discipline of Systematic Development of Web Applications",
  edited by Gerti Kappel, Birgit Proll, Siegfried Reich, and Werner Retschitzegger,
  John Wiley & Sons, 2006
• "Web Engineering", edited by Emilia Mendes and Nile Mosley, Springer-Verlag, 2005
• "Web Engineering: Principles and Techniques", edited by Woojong Suh, Idea Group
  Publishing, 2005
• "Building Web Applications with UML" (2nd edition), by Jim Conallen, Pearson
  Education, 2003
• "Information Architecture for the World Wide Web" (2nd edition), by Peter Morville and
  Louis Rosenfeld, O'Reilly, 2002
• "Web Site Engineering: Beyond Web Page Design", by Thomas A. Powell, David L. Jones
  and Dominique C. Cutts, Prentice Hall, 1998


Conferences 


• World Wide Web Conference (by IW3C2, since 1994): http://www.iw3c2.org
• International Conference on Web Engineering (ICWE) (since 2000)
  o 2007: http://www.icwe2007.org/
  o 2006: http://www.icwe2006.org
  o 2005: http://www.icwe2005.org
  o 2004: http://www.icwe2004.org
• ICWE Conference Proceedings
  o ICWE2005: LNCS Vol. 3579
  o ICWE2004: LNCS Vol. 3140
  o ICWE2003: LNCS Vol. 2722
• Web Information Systems Engineering Conference (by WISE Society, since 2000)
• International Conference on Web Information Systems and Technologies (WebIST)
  (since 2005)
• International Workshop on Web Site Evolution (WSE)
• International Conference on Software Engineering


Book Chapters and Articles 


• Murugesan, S. and Ginige, A., "Web Engineering: Introduction and Perspectives",
  Chapter 1 in "Web Engineering: Principles and Techniques" (Suh, W. ed.), Idea Group
  Publishing, 2005.
• Pressman, R.S., ‘Applying Web Engineering’, Part 3, Chapters 16-20, in Software
  Engineering: A Practitioner's Approach, Sixth Edition, McGraw-Hill, New York, 2004.


Journals


• Journal of Web Engineering
• International Journal of Web Engineering and Technology
• ACM Transactions on Internet Technology
• World Wide Web (Kluwer Academic Publishers)


Special issues


• Web Engineering, IEEE MultiMedia, Jan.-Mar. 2001 (Part 1) and April-June 2001 (Part 2).
• Usability Engineering, IEEE Software, January-February 2001.
• Web Engineering, Cutter IT Journal, 14(7), July 2001.
• Testing E-business Applications, Cutter IT Journal, September 2001.
• Engineering Internet Software, IEEE Software, March-April 2002.
• Usability and the Web, IEEE Internet Computing, March-April 2002.


See also 


Web modeling 


Webby Awards 


Awarded for: Excellence in websites
Presented by: International Academy of Digital Arts and Sciences
First awarded: 1994 (original version); 1996 (current incarnation)


Presented by The International Academy of Digital Arts and Sciences, The Webby Awards 
are a set of awards presented to the "world's best websites". The awards have been given out 
since 1996. There is also a second set of awards called the People's Voice Awards for the same 
categories which are given by popular vote. 


History


The first Webby Awards were given out from 1994 to 1996 by the World Wide Web
Organization, an organization dedicated to the promotion of the Web. It was largely
decommissioned sometime after July 1997. The organization was chiefly sponsored by
WebMagic, the Internet business developer that built many well-known online businesses,
including Toys.com (which became part of eToys) and Pets.com. Additional sponsorship was
provided by Cisco Systems and ADC Telecommunications.

The second, and current, Webby Awards were founded in 1996 by Tiffany Shlain, and 
sponsored by The Web magazine. When The Web ceased to exist in 1998, the International 
Academy of Digital Arts and Sciences (IADAS) was formed to take over. 

The organizations that created the first and second Webby Awards were not related to one 
another. It is not known if Tiffany Shlain was inspired by the first Webby Awards or if she came 
up with the concept independently. 


Awards granted 


The Webby Awards and The People's Voice Awards have been awarded each spring (since 
1996), and The Webby Business Awards and Honorable Mentions each fall (since 2003). 

The Webby Awards require a $250 fee from all websites which they nominate, which some 
blogs see as an unnecessary burden. 


The Webby Awards categories 


In each category, two awards are handed out: a Webby Award selected by a panel of judges, 
and a People's Voice Award selected by the votes of visitors to the The Webby Awards site. 
• Activism: Sites facilitating political change, social movement, rights, public 
  education, or revolution. 
• Best Practices: Sites demonstrating unparalleled excellence in six areas: 
    1. Content 
    2. Structure & Navigation 
    3. Visual Design 
    4. Interactivity 
    5. Functionality 
    6. Overall Experience 
• Broadband: Sites fully integrating bandwidth-intensive interactive content. 
• Commerce: Sites dedicated to selling goods and products online, or using a 
  ‘particularly innovative’ take on e-commerce. 
• Community: Sites creating/facilitating online community, connectedness and 
  communication around shared interests. 
• Education: Sites promoting education or providing learning materials, for 
  children or adults. 
• Fashion: Sites dedicated to personal style, clothing, design, and accessories. 
• Film: Sites dedicated to preservation, celebration and presentation of film and 
  film culture. 
• Finance: Sites providing financial services and/or information. 
• Games: Sites dedicated to (the culture of) computer gaming. 
• Government & Law: Sites dedicated to legal and government matters. 
• Health: Sites dedicated to improving personal health and well-being. 
• Humor: Sites dedicated to comedy or the art of laughter. 
• Kids: Sites created for "the under-18 set". 
• Lifetime Achievement: An honorary award granted to individuals who have 
  been integral to the growth of the Web. 
• Living: Sites providing information and resources that allow people to live on 
  their own terms. 
• Music: Sites dedicated to the distribution and presentation of music or its 
  culture. 
• NetArt: Sites that integrate art and technology, exploring and expanding the 
  distinctive capacity of the medium. 
• News: Sites dedicated to announcing and tracking recent goings-on. 
• Personal Web Site: Sites created by an individual about themselves, or sites 
  dedicated to an individual. 
• Politics: Sites providing political information. 
• Print & Zines: Sites dedicated to the written word, including online versions of 
  print publications. 
• Radio: Sites with ties to a radio network, station, or program. 
• Rising Star: Given to the site with the greatest percent increase in visitors from 
  April to May of the year in question, according to Nielsen//NetRatings data. 
  Sites must meet a minimum cutoff of 500K unique visitors for April and May to 
  be considered. 
• Science: Sites dedicated to scientific exploration and news. 
• Services: Sites that enable real world activities online. 
• Spirituality: Sites dedicated to the spirit or the soul, or on any Divine power. 
• Sports: Sites dedicated to athletics. 
• Technical Achievement: Sites implementing technology that pushes the 
  envelope, inviting visitors to believe in what once seemed impossible. 
• Top Global Properties (Top U.S. Properties): Given to the site that draws the 
  most visitors in the 14 countries that Nielsen//NetRatings measure (the U.S.). 
• Travel: Sites dedicated to travel services and information. 
• TV: Sites which are online counterparts of television programs or networks. 
• Weird: Sites so forward thinking they seem strange when viewed without the 
  future in mind. 
• Youth: (see Kids) 


The Webby Business Awards categories 


• Automotive 
• Creative Services 
• Financial Services 
• Food & Beverage 
• "Good Deeds" 
• Healthcare/Pharmaceuticals 
• Information Technology 
• Media/Entertainment 
• Marketing & Communications 
• Professional Services 
• Retail 
• Technical Services 
• Telecommunications 
• Travel/Tourism 


Honorable Mention Categories 


• Branding & Design 
• Cost-Cutting/Operational Efficiency 
• Customer Relationship Management/Customer Loyalty & Retention 
• Integration of Online/Offline Experience 
• Product Customization/Personalization 
• Sales Lead Generation/Increased Sales 


Acceptance Speeches 


The Webbys are famous for limiting recipients to five-word speeches, which are often 
humorous. For example, in 2006 Cute Overload's was "not bad for posting kittens" and 
TripAdvisor's was "because some hotels really suck". At the 2007 awards, David Bowie's speech 
was "I only get five words? Shit, that was five. Four more there. That's three. Two." 


Criticism 


San Francisco blog SFist has charged that the Webby Awards are essentially a Who's Who 
scam, as they require a "processing fee" be paid for award consideration.[8] 


World Wide Web Consortium 


Type: Consortium 
Founded: October 1994 
Founder: Tim Berners-Lee 
Headquarters: MIT/CSAIL in USA; ERCIM in France; Keio University in Japan; and many other 
offices around the world 
Website: www.w3.org 


The World Wide Web Consortium (W3C) is the main international standards organization 
for the World Wide Web (W3). It is arranged as a consortium where member organizations 
maintain full-time staff for the purpose of working together in the development of standards for 
the W3. As of March 2007, the W3C had 441 members. It is always open for new organizations 
to join. 

W3C also engages in education and outreach, develops software and serves as an open forum 
for discussion about the Web. 

The Consortium is headed by Sir Tim Berners-Lee, the primary author of the original URL 
(Uniform Resource Locator), HTTP (HyperText Transfer Protocol) and HTML (HyperText 
Markup Language) specifications, the principal technologies that form the basis of the World 
Wide Web. 


History 


In October 1994, Tim Berners-Lee left the European Organization for Nuclear Research 
(CERN) and founded the World Wide Web Consortium (W3C) at the Massachusetts Institute of 
Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense 
Advanced Research Projects Agency (DARPA) -- which had pioneered the Internet -- and the 
European Commission. 

The consortium was created to ensure compatibility and agreement among industry members 
in the adoption of new standards. Prior to its creation, incompatible versions of HTML were 
offered by different vendors, increasing the potential for inconsistency between web pages. The 
consortium was created to get all those vendors to agree on a set of core principles and 
components which would be supported by everyone. 

It was originally intended that CERN host the European branch of W3C. However, CERN 
wished to focus on particle physics, not information technology. In April 1995 the Institut 
national de recherche en informatique et en automatique (INRIA) became the European host of 
W3C, with Keio University becoming the Japanese branch in September 1996. Starting in 1997, 
W3C created regional offices around the world; as of May 2006 it had sixteen World Offices 
covering Australia, the Benelux countries (the Netherlands, Luxembourg, and Belgium), Mainland 
China, Finland, Germany and Austria, Greece, Hong Kong, Hungary, India, Ireland, Israel, Italy, 
Japan, South Korea, Morocco, Spain, Sweden, the United Kingdom and the United 
States. 

In January 2003, the European host was transferred from INRIA to the European Research 
Consortium for Informatics and Mathematics (ERCIM), an organization that represents 
European national computer science laboratories. 


Recommendations and certifications 


In accord with the W3C Process Document, a Recommendation progresses through five 
maturity levels: 

1. Working Draft (WD) 
2. Last Call Working Draft 
3. Candidate Recommendation (CR) 
4. Proposed Recommendation (PR) 
5. W3C Recommendation (REC) 

A Recommendation may be updated by separately published Errata until enough substantial 
edits accumulate, at which time a new edition of the Recommendation may be produced (e.g., 
XML is now in its fourth edition). W3C also publishes various kinds of informative Notes which 
are not intended to be treated as standards. 

The Consortium leaves it up to manufacturers to follow the Recommendations. Many of its 
standards define levels of conformance, which the developers must follow if they wish to label 
their product W3C-compliant. As with the standards of other organizations, W3C Recommendations 
are sometimes implemented only partially. The Recommendations are under a royalty-free patent 
license, allowing anyone to implement them. 

Unlike the ISOC and other international standards bodies, the W3C does not have a 
certification program. A certification program is a process which has benefits and drawbacks; the 
W3C has decided, for now, that it is not suitable to start such a program owing to the risk of 
creating more drawbacks for the community than benefits. 



Administration 


The Consortium is jointly administered by the MIT Computer Science and Artificial 
Intelligence Laboratory (CSAIL) in the USA, the European Research Consortium for Informatics 
and Mathematics (ERCIM) (in Sophia Antipolis, France), and Keio University (in Japan). The 
W3C also has World Offices in fifteen regions around the world. The W3C Offices work with 
their regional Web communities to promote W3C technologies in local languages, broaden 
W3C's geographical base, and encourage international participation in W3C Activities. 


Standards 
W3C/IETF Standards (over Internet protocol suite): 
• WSDL 
• XACML 
• XHTML 
• DOM 
• XML 
• HTML 
• XML Events 
• RDF 
• XForms 
• XML Information Set 
• SOAP 
• XML Schema 
• SRGS 
• XPath 
• SSML 
• XQuery 
• VoiceXML 
• XSLT 


See also 


• Mobile Web 
• Web server 
• Web services 
• Web template 


Ajax 

Ajax, or AJAX, is a web development technique used for creating interactive web 
applications. The intent is to make web pages feel more responsive by exchanging small amounts 
of data with the server behind the scenes, so that the entire web page does not have to be 
reloaded each time the user requests a change. This is intended to increase the web page's 
interactivity, speed, functionality, and usability. 

The name is shorthand for Asynchronous JavaScript and XML. Ajax is asynchronous in that 
data is requested and loaded in the background without interfering with the display and 
behavior of the current page. JavaScript is the programming language in which Ajax function 
calls are made. Data retrieved using the technique is commonly formatted using XML, as 
reflected in the naming of the XMLHttpRequest object from which Ajax is derived. 

Ajax is a cross-platform technique usable on many different operating systems, computer 
architectures, and Web browsers as it is based on open standards such as JavaScript and XML, 
together with open source implementations of other required technologies. 


Constituent technologies 


Ajax uses a combination of: 

• XHTML (or HTML) and CSS, for marking up and styling information. 

• The DOM accessed with a client-side scripting language, especially ECMAScript 
  implementations such as JavaScript and JScript, to dynamically display and 
  interact with the information presented. 

• The XMLHttpRequest object is used to exchange data asynchronously with the 
  web server. In some Ajax frameworks and in certain situations, an IFrame 
  object is used instead of the XMLHttpRequest object to exchange data with the 
  web server, and in other implementations, dynamically added <script> tags may 
  be used. 

• XML is sometimes used as the format for transferring data between the server 
  and client, although any format will work, including preformatted HTML, plain 
  text and JSON. These files may be created dynamically by some form of 
  server-side scripting. 

Like DHTML, LAMP, and SPA, Ajax is not a technology in itself, but a term that refers to 
the use of a group of technologies. 

The "core" and defining element of Ajax is the XMLHttpRequest object, which gives 
browsers the ability to make dynamic and asynchronous data requests without having to unload 
and reload a page. Given that XMLHttpRequest can eliminate the need for page refreshes, other 
technologies have become more prominently used and highlighted with this development 
approach. 
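
The request cycle described above can be sketched in TypeScript roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the XhrLike interface, the requestText function and the injected createXhr factory are names invented for this sketch, chosen so that the logic can also be exercised outside a browser. In a page, the factory would typically return new XMLHttpRequest().

```typescript
// Minimal subset of the XMLHttpRequest interface used by this sketch.
// (XhrLike is a hypothetical name, not part of any browser API.)
interface XhrLike {
  readyState: number;
  status: number;
  responseText: string;
  onreadystatechange: ((ev?: any) => any) | null;
  open(method: string, url: string, async: boolean): void;
  send(body?: string | null): void;
}

// Requests `url` in the background and hands the response body to `onDone`
// without unloading or reloading the page.
function requestText(
  createXhr: () => XhrLike,
  url: string,
  onDone: (text: string) => void,
  onError: (status: number) => void
): void {
  const xhr = createXhr();
  xhr.onreadystatechange = () => {
    if (xhr.readyState !== 4) return; // 4 = DONE, the response is complete
    if (xhr.status >= 200 && xhr.status < 300) {
      onDone(xhr.responseText);
    } else {
      onError(xhr.status);
    }
  };
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.send(null);
}
```

The callback passed as onDone is where a page would update part of the DOM with the returned data, leaving the rest of the page untouched.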

Besides XMLHttpRequest, the use of DOM, CSS, and JavaScript provides an enhanced 
"single-page" experience. 


History 


The first use of the term in public was by Jesse James Garrett in February 2005. Garrett 
thought of the term when he realized the need for a shorthand term to represent the suite of 
technologies he was proposing to a client. 

Although the term Ajax was coined in 2005, most of the technologies that enable Ajax started 
a decade earlier with Microsoft's initiatives in developing Remote Scripting. Referring to the 
idea as Inner-Browsing, Netscape Evangelism published an article in 2003 which presented ideas 
for implementing models in which "all navigation occurs within a single page, as in a typical 
application interface."[2] Techniques for the asynchronous loading of content on an existing 
Web page without requiring a full reload date back as far as the IFRAME element type 
(introduced in Internet Explorer 3 in 1996) and the LAYER element type (introduced in 
Netscape 4 in 1997, abandoned during early development of Mozilla). Both element types had a 
src attribute that could take any external URL, and by loading a page containing JavaScript that 
manipulated the parent page, Ajax-like effects could be attained. This set of client-side 
technologies was usually grouped together under the generic term of DHTML. Macromedia's 
Flash could also, from version 4, load XML and CSV files from a remote server without 
requiring a browser to be refreshed. 

Microsoft's Remote Scripting (MSRS), introduced in 1998, acted as a more elegant 
replacement for these techniques, with data being pulled in by a Java applet with which the client 
side could communicate using JavaScript. This technique worked on both Internet Explorer 
version 4 and Netscape Navigator version 4 onwards. Microsoft then created the 
XMLHttpRequest object in Internet Explorer version 5 and first took advantage of these 
techniques using XMLHttpRequest in Outlook Web Access supplied with the Microsoft 
Exchange Server 2000 release. 

The Web development community, first collaborating via the 
microsoft.public.scripting.remote newsgroup and later through blog aggregation, subsequently 
developed a range of techniques for remote scripting to enable consistent results across different 
browsers. In 2002, a user-community modification to Microsoft Remote Scripting was made to 
replace the Java applet with XMLHttpRequest. 

Remote Scripting Frameworks such as ARSCIF surfaced in 2003, not long before Microsoft 
introduced Callbacks in ASP.NET. 

In addition, the World Wide Web Consortium has several Recommendations that also allow 
for dynamic communication between a server and user agent, though few of them are well 
supported. These include: 

• The object element defined in HTML 4 for embedding arbitrary content types 
  into documents (replaces inline frames under XHTML 1.1) 
• The Document Object Model (DOM) Level 3 Load and Save Specification [1] 


Justification 


The core justification for Ajax style programming is to overcome the page loading 
requirements of HTML/HTTP-mediated web pages. Ajax creates the necessary initial conditions 
for the evolution of complex, intuitive, dynamic, data-centric user interfaces in web pages—the 
realization of that goal is still a work in progress. 

Web pages, unlike native applications, are loosely coupled, meaning that the data they 
display are not tightly bound to data sources and must be first marshaled (set out in proper order) 
into an HTML page format before they can be presented to a user agent on the client machine. 
For this reason, web pages have to be re-loaded each time a user needs to view different datasets. 
By using the XMLHttpRequest object to request and return data without a re-load, a programmer 
by-passes this requirement and makes the loosely coupled web page behave much like a tightly 
coupled application, but with a more variable lag time for the data to pass through a longer 
"wire" to the remote web browser. 

For example, in a classic desktop application, a programmer has the choice of populating a 
tree view control with all the data needed when the form initially loads, or with just the top-most 
level of data—which would load more quickly, especially when the dataset is very large. In the 
second case, the application would fetch additional data into the tree control depending on which 
item the user selects. This functionality is difficult to achieve in a web page without Ajax. To 
update the tree based on a user's selection would require the entire page to re-load, leading to a 
very jerky, non-intuitive feel for the web user who is browsing the data in the tree. 
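
The second strategy, loading only the top level and fetching children on demand, might look like the following TypeScript sketch. The TreeNode shape, the ChildFetcher callback type and expandNode are hypothetical names introduced here for illustration; the fetcher stands in for whatever Ajax request a real page would issue.

```typescript
// A tree item whose children are fetched only when the user first expands it.
interface TreeNode {
  id: string;
  label: string;
  children: TreeNode[] | null; // null means "not requested from the server yet"
}

// Stand-in for an Ajax call: asks the server for the children of one node
// and passes them to `done` when the response arrives.
type ChildFetcher = (
  parentId: string,
  done: (children: TreeNode[]) => void
) => void;

function expandNode(
  node: TreeNode,
  fetchChildren: ChildFetcher,
  onReady: (node: TreeNode) => void
): void {
  if (node.children !== null) {
    onReady(node); // already loaded: no request and no page reload
    return;
  }
  fetchChildren(node.id, (children) => {
    node.children = children; // cache so later expansions cost nothing
    onReady(node);
  });
}
```

Only the expanded branch is requested, so the initial page stays small and the rest of the tree remains on screen while the new items arrive.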


Advantages of Ajax 
Bandwidth usage 


By generating the HTML locally within the browser, and only bringing down JavaScript calls 
and the actual data, Ajax web pages can appear to load relatively quickly since the payload 
coming down is much smaller in size. An example of this technique is a large result set where 
multiple pages of data exist. With Ajax, the HTML of the page (e.g., a table structure with 
related TD and TR tags) can be produced locally in the browser and not brought down with the 
first page of the document. 
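
As a rough sketch of this idea, the server could send only the row data while the page builds the TR and TD markup itself. The function names and the Row type below are assumptions made for this example, not part of any particular framework.

```typescript
// One record of the result set, as it might arrive from the server (for example as JSON).
type Row = Record<string, string | number>;

// Escape characters that have special meaning in HTML so data cannot break the markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the table markup locally from raw data; only `rows` has to travel over the network.
function renderTable(columns: string[], rows: Row[]): string {
  const head =
    "<tr>" + columns.map((c) => "<th>" + escapeHtml(c) + "</th>").join("") + "</tr>";
  const body = rows
    .map((row) => {
      const cells = columns.map((c) => {
        const value = row[c];
        return "<td>" + escapeHtml(value === undefined ? "" : String(value)) + "</td>";
      });
      return "<tr>" + cells.join("") + "</tr>";
    })
    .join("");
  return "<table>" + head + body + "</table>";
}
```

When the user moves to the next page of results, only a new array of rows needs to be fetched; the surrounding markup is regenerated in the browser.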

In addition to "load on demand" of contents, some web-based applications load stubs of 
event handlers and then load the functions on the fly. This technique significantly cuts down the 
bandwidth consumption for web applications that have complex logic and functionality. 
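
The stub technique can be illustrated with a small TypeScript sketch. It is simplified on purpose: in a real page the load step would fetch a script from the server asynchronously, whereas here load is a synchronous, hypothetical callback so that the control flow stays easy to follow.

```typescript
type Handler = (arg: string) => string;

// Returns a lightweight stub. The real handler is obtained through `load`
// only the first time the stub is invoked, then reused for every later call.
function lazyHandler(load: () => Handler): Handler {
  let real: Handler | null = null;
  return (arg) => {
    if (real === null) {
      real = load(); // the first call pays the loading cost
    }
    return real(arg);
  };
}
```

A page can register many such stubs up front and only download the logic behind the ones the user actually triggers.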


Separation of data, format, style, and function 


A less specific benefit of the Ajax approach is that it tends to encourage programmers to 
clearly separate the methods and formats used for the different aspects of information delivery 
via the web. Although Ajax can appear to be a jumble of languages and techniques, and 
programmers are free to adopt and adapt whatever works for them, they are generally propelled 
by the development motive itself to adopt separation among the following: 

1. Raw data or content to be delivered, which is normally embedded in XML and 
   sometimes derived from a server-side database. 

2. Format or structure of the webpage, which is almost always built in HTML or 
   XHTML and is then reflected and made available to dynamic manipulation in the 
   DOM. 

3. Style elements of the webpage: everything from fonts to picture placement are 
   derived by reference to embedded or referenced CSS. 

4. Functionality of the webpage, which is provided by a combination of: 
    1. JavaScript on the client browser (also called DHTML), 
    2. Standard HTTP and XMLHttp for client-to-server communication, and 
    3. Server-side scripting and/or programs using any suitable language 
       preferred by the programmer to receive the client's specific requests and 
       respond appropriately. 


Disadvantages 
Browser integration 


The dynamically created page does not register itself with the browser history engine, so 
triggering the "Back" function of the user's browser might not bring the desired result. 

Developers have implemented various solutions to this problem. These solutions can involve 
using invisible IFRAMEs to invoke changes that populate the history used by a browser's back 
button. Google Maps, for example, performs searches in an invisible IFRAME and then pulls 
results back into an element on the visible web page. The World Wide Web Consortium (W3C) 
did not include an iframe element in its XHTML 1.1 Recommendation; the Consortium 
recommends the object element instead. 

Another issue is that dynamic web page updates make it difficult for a user to bookmark a 
particular state of the application. Solutions to this problem exist, many of which use the URL 
fragment identifier (the portion of a URL after the '#') to keep track of, and allow users to return 
to, the application in a given state. This is possible because many browsers allow JavaScript to 
update the fragment identifier of the URL dynamically, so that Ajax applications can maintain it 
as the user changes the application's state. This solution also improves back-button support. It is, 
however, not a complete solution. 
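
One way to picture the fragment-identifier approach is a pair of helpers that turn application state into a fragment and back. This is a sketch under assumptions: the AppState type and both function names are invented here, and the state is limited to string key/value pairs. In a browser the resulting string would be assigned to window.location.hash, which is not done in the sketch itself.

```typescript
// Application state that should survive bookmarking, e.g. { view: "map", q: "cafes" }.
type AppState = Record<string, string>;

// Serialize the state into a fragment such as "#q=cafes&view=map".
// Keys are sorted so the same state always yields the same URL.
function encodeFragment(state: AppState): string {
  const parts = Object.keys(state)
    .sort()
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(state[key]));
  return "#" + parts.join("&");
}

// Parse a fragment back into state. Note that decodeURIComponent throws on
// malformed escape sequences; a real page would need to guard against that.
function decodeFragment(fragment: string): AppState {
  const state: AppState = {};
  const body = fragment.charAt(0) === "#" ? fragment.slice(1) : fragment;
  for (const pair of body.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq === -1 ? pair : pair.slice(0, eq);
    const value = eq === -1 ? "" : pair.slice(eq + 1);
    state[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return state;
}
```

On page load, the application would decode the current fragment and restore the corresponding view, which is what makes a bookmarked URL return to the same state.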


Response-time concerns 


Network latency — or the interval between user request and server response — needs to be 
considered carefully during Ajax development. Without clear feedback to the user,[8] smart 
preloading of data and proper handling of the XMLHttpRequest object, users might experience 
delays in the interface of the web application, something which they might not expect or 
understand. Additionally, when an entire page is rendered there is a brief moment of re- 
adjustment for the eye when the content changes. The lack of this re-adjustment with smaller 
portions of the screen changing makes the latency more apparent. The use of visual feedback 
(such as throbbers) to alert the user of background activity and/or preloading of content and data 
are often suggested solutions to these latency issues. 


Search engine optimization 


Websites that use Ajax to load data which should be indexed by search engines must be 
careful to provide equivalent Sitemaps data at a public, linked URL that the search engine can 
read, as search engines do not generally execute the JavaScript code required for Ajax 
functionality. This problem is not specific to Ajax, as the same issue occurs with sites that 
provide dynamic data as a full-page refresh in response to, say, a form submit (the general 
problem is sometimes called the hidden, or deep web). 


Reliance on JavaScript 


Ajax relies on JavaScript, which is often implemented differently by different browsers or 
versions of a particular browser. Because of this, sites that use JavaScript may need to be tested 
in multiple browsers to check for compatibility issues. It's not uncommon to see JavaScript code 
written twice, one part for IE, another part for Mozilla compatibles, although this is less true with 
the release of IE7 and with the now-common use of JavaScript abstraction libraries like the 
Prototype JavaScript Framework. Such libraries abstract browser-specific differences from the 
web developer. 

The level of IDE support for JavaScript used to be poor, although this is changing with more 
widespread use of tools like Firebug, IE Developer Toolbar and Venkman. 

An issue also arises if the user has switched off JavaScript support in the browser, thus 
disabling the functionality built into the page. 


Web analytics 


Many web analytics solutions are based on the paradigm of a new page being loaded 
whenever new or updated content is displayed to the user, or to track a series of steps in a 
process such as a check-out. Since Ajax alters this process, care must be taken to account for 
how to instrument a page or a portion of a page so that it can be accurately tracked. Analytics 
systems which allow for the tracking of events other than a simple page view, such as the click 
of a button or link, are the ones most likely to be able to accommodate a site which heavily 
utilizes Ajax. 


Accessibility 


Non-Ajax users would ideally continue to load and manipulate the whole page as a fallback, 
enabling the developers to preserve the experience of users in non-Ajax environments (including 
all relevant accessibility concerns) while giving those with capable browsers a much more 
responsive experience. 


JavaScript library 


A JavaScript library is a library of pre-written JavaScript controls which allow for easier 
development of JavaScript-based applications, especially for AJAX and other web-centric 
technologies. 

While JavaScript, as first developed by Netscape (and later Mozilla), has long had a presence 
on the Web for many websites, it gained a particular pitch with the rise of the Web 2.0 era of 
computing, in which JavaScript, combined with multiple flavors of XML, became increasingly 
used for the development of user interfaces for applications, both web-based and desktop-based. 
JavaScript was also combined with CSS to create Dynamic web pages, which have also become 
popular as a more accessible alternative to Flash-based websites. 

With the expanded demands for JavaScript, an easier means for programmers to develop 
such dynamic interfaces was needed. Thus, JavaScript libraries such as jQuery and Dojo Toolkit 
were developed, allowing developers to concentrate on more distinctive applications 
of AJAX. This has led to other companies and groups, such as Microsoft and Yahoo!, developing 
their own JavaScript-based user interface libraries, which find their way into the web 
applications developed by these companies. 

Almost all JavaScript libraries are released under either a copycenter or copyleft license to 
ensure license-free distribution, usage, and modification. 

Furthermore, some JavaScript libraries allow for easier ties between JavaScript and other 
languages, such as CSS, PHP, and Java. This is to ensure that JavaScript applications will be able 
to run seamlessly alongside code written in those languages. 


Google Web Toolkit 


The Google Web Toolkit is an open source toolkit allowing developers to create Ajax 
applications in the Java programming language [1]. GWT supports rapid client/server 
development and debugging in any Java IDE. In a subsequent deployment step, the GWT 
compiler translates a working Java application into equivalent JavaScript that programmatically 
manipulates a web browser's HTML DOM using DHTML techniques. GWT emphasizes 
reusable, efficient solutions to recurring Ajax challenges, namely asynchronous remote 
procedure calls, history management, bookmarking, and cross-browser portability. 


Features 


• Dynamic, reusable UI components 
• Simple RPC 
• Browser history management 
• Debugging using Java debuggers 
• Browser compatible 
• JUnit integration 
• Internationalization 
• Interoperability and fine-grained control 
• Google API Library: Google Gears support 
• Completely Open Source 


Available widgets 


As of December 2006, the Google Web Toolkit contains widgets for: 
• Panels (AbsolutePanel, CellPanel, DeckPanel, DockPanel, and many more) 
• Buttons 
• Checkboxes 
• Radio Buttons 
• Tables (FlexTable, Grid, etc.) 
• Dialog Boxes 
• HTML primitives (Images, Hyperlinks, and raw HTML) 
• Menus and Menubars 
• Scrollpanes 
• Tabs 
• Trees 

Many common widgets not found in the Google Web Toolkit have been implemented in 
third-party libraries, such as the GWT Component Library. 

The rocket-gwt library contains many extras which help developers author sophisticated 
GWT-powered applications, including: 
• template-based code generation using deferred binding 
• bean factories with AOP support (on the client) 
• cometting of objects 
• JSON serialization 
• browser-independent CSS manipulation 
• programmatic support for determining and changing (mouse) selections 
• many additional widgets 


History 


• GWT 1.0: May 17, 2006 
• GWT 1.1: August 11, 2006 
• GWT 1.2: November 16, 2006 
• GWT 1.3: February 05, 2007 
• GWT 1.4RC: May 29, 2007 
• GWT 1.4: GWT Version 1.4 Development Plan 


See Also 


• Ajax (programming) 


Smart client 


Smart Client is a new "buzzword" used in software development, generally referring to 
applications which: 
• are delivered over the web 
• do not require installation (or provide automated installation and updates) 
• automatically update without user action 
• have the look and feel of desktop applications 

The term "Smart Client" is meant to refer to simultaneously capturing the benefits of a "thin 
client" (zero-install, auto-update) and a "fat client" (high performance, high productivity). 

A "Smart Client" application can be created in several very different technologies. The 
original use of the term in the context of the web was Isomorphic Software's SmartClient product 
(they own the SmartClient.com domain), which uses an Ajax-based, cross-browser, cross-platform 
approach. Subsequently Microsoft began using the terminology to refer to .NET 
applications delivered via the Internet Explorer browser to Windows XP. The terms "Rich 
Internet Application" (RIA) and "rich web application" are essentially synonymous with "Smart 
Client", and are used to refer to several other technological approaches including Flash, Java 
applets and Webstart applications. 

The Smart Client approach came about because when businesses tried to develop web 
applications to replace their old desktop applications, user productivity decreased. This was 
because web-based user interfaces based on server-side HTML generation are typically not as 
responsive, have fewer hot keys and require more use of the mouse, etc. 

Smart Client applications bridge the gap between web applications and desktop applications. 
They provide the benefits of a web application (such as leveraging the internet and offering 
remote access to data) while still providing the snappy look and feel inherent to desktop 
applications. 


Reverse Ajax 


Reverse Ajax, not unlike DHTML, LAMP, Ajax and SPA, is not a technology in itself, but 
a term that refers to the use of a group of technologies together. These technologies include: 
• Ajax, for handling the data on the client side in a smooth and interactive way 
  and passing data between server and client. 
• A technology for pushing server data to a browser: 
    ◦ Comet: a connection between a server and client is kept open by slowly 
      loading a page in a hidden frame. 
    ◦ Piggyback: extra data is added (piggybacked) onto a normal client-server 
      interaction. 
    ◦ Polling: the client repeatedly queries (polls) the server. 

Reverse Ajax is different from Ajax, as Reverse Ajax is a suite of technologies for pushing 
data from a server to a client. These technologies are built upon an Ajax framework. 


Notes 


The Polling technique is not properly a Reverse Ajax solution, because in this case the server 
simply responds to repeated direct Ajax requests. The result is the same as with Reverse Ajax, 
but the technique is inelegant because it generates a lot of unnecessary traffic. 

Let's take a simple example: 

The broadcast message. 

We have 2 clients and 1 server: client1 needs to send "hello" to all the other clients. 


With Polling: 
• client1 sends the message "hello" 
• server receives the message "hello" 
• client2 polls the server 
• client2 receives the message "hello" 
• client1 polls the server <-- unnecessary traffic starts 
• client1 receives the message "hello" 
• client2 polls the server 
• client2 receives the message "hello" <-- unnecessary traffic continues 

and these last lines repeat forever. 


With Comet: 
• client1 sends the message "hello" 
• server receives the message "hello" 
• server sends the message "hello" to all clients 

no unnecessary traffic. 


With PiggyBack: 
• client1 sends the message "hello" 
• server receives the message "hello" 
• client2 does any request to the server 
• server adds to the response the message "hello" 

no unnecessary traffic. 
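
The difference in traffic between polling and a Comet-style push can be made concrete with a toy simulation in TypeScript. This is a deliberately simplified model, not a measurement of any real system: time advances in discrete ticks, every client polls once per tick, at most one message is assumed per tick, and the cost of keeping a Comet connection open is ignored. All names are invented for the sketch.

```typescript
interface TrafficReport {
  transfers: number;  // request/response exchanges or pushes on the wire
  deliveries: number; // transfers that actually carried a new message
}

// Polling: every client asks the server once per tick, whether or not
// anything new is waiting.
function simulatePolling(
  clients: number,
  ticks: number,
  messageTicks: number[]
): TrafficReport {
  let transfers = 0;
  let deliveries = 0;
  for (let tick = 0; tick < ticks; tick++) {
    transfers += clients;
    if (messageTicks.indexOf(tick) !== -1) {
      deliveries += clients;
    }
  }
  return { transfers, deliveries };
}

// Comet-style push: the server writes to each open connection only on
// ticks where a message exists.
function simulatePush(
  clients: number,
  ticks: number,
  messageTicks: number[]
): TrafficReport {
  let transfers = 0;
  for (let tick = 0; tick < ticks; tick++) {
    if (messageTicks.indexOf(tick) !== -1) {
      transfers += clients;
    }
  }
  return { transfers, deliveries: transfers };
}
```

With two clients, ten ticks and a single message, the polling model performs 20 exchanges of which 2 carry the message, while the push model performs 2 transfers in total.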


See also 


• Ajax 


Document Update Markup Language 


Document Update Markup Language (DUML) is an XML specification created by Brian 
Kardell to enable server-side logic DOM manipulation outside the context of conventional 
JavaScript functions. 

DUML supports a simpler approach to AJAX. With most current approaches, DOM 
manipulation is accomplished through arbitrarily complex client-side JavaScript. With the 
DUML approach, the browser makes a standard AJAX call to the server. Then DOM 
manipulation instructions (such as appending nodes, replacing nodes, etc.) are generated server- 
side as a DUML document, delivered to the browser, and finally interpreted by a simple DUML 
interpreter running on the web page. 

The net effect is to move complex DOM manipulation logic out of the web page (and out of 
any associated JavaScript files) and onto the server. This may be desirable in cases where a 
development team wishes to put this sort of logic in the hands of the software developers rather 
than having the web designers handle this. 

DUML itself is very simple, supporting a set of nine manipulations: appendChild, 
insertBefore, insertAfter, replace, remove, replaceContent, set-attribute, executeScript and 
queueScript. Since instructions map more or less directly to universally supported methods, the 
interpreter is correspondingly simple. 
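
To illustrate why such an interpreter can stay small, here is a minimal sketch that applies three of the nine manipulations to an XML tree. It is not the DUML specification: the `updates` wrapper element and the `target`, `name` and `value` attributes are assumptions made for this example, and Python's standard `xml.etree.ElementTree` stands in for the browser DOM that a real interpreter would manipulate from JavaScript.

```python
import xml.etree.ElementTree as ET


def find_by_id(root, element_id):
    """Return the first element whose id attribute matches."""
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    raise KeyError(element_id)


def find_parent(root, child):
    """ElementTree keeps no parent links, so search for the parent."""
    for parent in root.iter():
        if child in list(parent):
            return parent
    raise KeyError("element has no parent")


def apply_updates(page, updates_xml):
    """Apply each instruction in a (hypothetical) update document to page."""
    for op in ET.fromstring(updates_xml):
        target = find_by_id(page, op.get("target"))
        if op.tag == "appendChild":
            for node in list(op):
                target.append(node)
        elif op.tag == "remove":
            find_parent(page, target).remove(target)
        elif op.tag == "set-attribute":
            target.set(op.get("name"), op.get("value"))
        else:
            raise ValueError(f"unsupported instruction: {op.tag}")


page = ET.fromstring('<div id="root"><p id="old">stale</p><ul id="list"/></div>')
updates = """<updates>
  <appendChild target="list"><li>new item</li></appendChild>
  <remove target="old"/>
  <set-attribute target="root" name="class" value="fresh"/>
</updates>"""
apply_updates(page, updates)
print(ET.tostring(page, encoding="unicode"))
```

Each branch maps one instruction onto a single tree operation, which is the property the paragraph above attributes to the interpreter.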

The DUML specification itself is generic and not tied directly to HTML based applications 
and therefore can be used with any XML/DOM based user interface language as long as the 
client supports a language in which an interpreter can be implemented. 


History 


The name "DUML" originated with Kardell's son, who remarked that writing DOM 
manipulation code using complicated client-side JavaScript was "dumb". 


Comparison between AJAX and Flex 


Ajax and Adobe Flex are popular Web development technologies to create Rich Internet 
Applications. They can be used separately or in combination. Ajax developers often use an Ajax 
framework to speed up development. Flex developers use Adobe's tools for development. This 
article compares Ajax frameworks with Adobe Flex. 


About this Comparison 


Adobe Flex is a single-vendor product (proprietary to Adobe) while Ajax is a collection of 
techniques to create web applications that is not specific to a vendor. With Ajax the following 
options can be considered: 

• Custom Ajax (develop from scratch) 
• Open Source Frameworks 
• Commercial Frameworks 
The comparison below will take these options into account when comparing Ajax with Flex. 


Other RIA Technologies 


There are more technologies that can be used for building RIAs, other than Ajax and Flex. A 
full list can be found in the ‘Methods and Techniques’ section in the Rich Internet Application 
article. 


Interoperability 


Ajax and Flex can also be used together. There are two tools available to facilitate this 
integration: the Flex Ajax Bridge and Ajax Data Services. 


Comparison of Product Capabilities 
Runtime 


Adobe Flex relies on the Flash 9 browser plug-in, which needs to be present in the browser of 
the website visitor. Ajax uses the various Internet browsers as its runtime. Some Ajax 
Frameworks have an additional JavaScript engine that abstracts away from differences in 
browser implementations: developers use the engine which in turn communicates with the 
browser. 

The benefit of using a proprietary plug-in such as Flash 9 is the controlled runtime environment, 
which is identical across all web browsers. This makes development easier and allows Adobe to 
add features and improve performance. The downside is that the plug-in needs to be 
installed, which can pose a problem in environments with locked-down operating systems. 
Adobe periodically publishes data on the market penetration of the Flash Player. Some critics of 
Flash, such as Jakob Nielsen, regard reliance on a plug-in as a break with web standards, as 
the web browser is only used to launch the player, which does not use web standards such as 
HTML, CSS and JavaScript. The Flash plug-in offers some support for HTML, CSS and an 
extended version of JavaScript (ECMAScript). 

The benefit of Ajax is that it runs in various web browsers, without the need for a plug-in. 
The downside is that there are differences between browser implementations of DOM, 
JavaScript and CSS, which can result in Ajax applications that work in one browser, but not in 
another. Ajax Frameworks mitigate this issue by offering a browser compatibility layer. Another 
downside of Ajax is that it requires JavaScript to be enabled in the web browser. It is possible to 
make Ajax applications degrade gracefully when JavaScript is disabled, but this requires 
additional application design. 


Development 


Flex has a declarative development model, in which the application is defined with XML 
tags. Additional programming is done in ECMAScript. Styling can be added using CSS. Ajax 
can be programmed in JavaScript or with Ajax frameworks that offer a declarative programming 
model (e.g. Backbase). Ajax styling is also done with CSS. There are also server-side Ajax 
frameworks that allow Ajax programming with JavaServer Faces (e.g. Icesoft and Backbase) or 
.NET (e.g. Telerik and Infragistics). 

Adobe Flex has its own IDE: Adobe Flex Builder, which is based on Eclipse. There are many 
IDEs that can be used for Ajax development. They include: 

• Generic IDEs such as Eclipse and Visual Studio.NET 
• Dedicated Ajax IDEs such as Aptana 
• IDEs bundled with Ajax Frameworks 

Some Ajax Frameworks develop their IDE on top of an existing IDE: for example, Backbase 
uses Eclipse. Other Ajax Frameworks develop their IDE in the web browser, often using their 
own framework, such as Tibco GI. 

Adobe bundles a framework for automated testing with Flex, while for Ajax most existing 
web page testing tools can be used. 


Performance 


RIA performance consists of various aspects, such as the size for the initial download, the 
speed of the runtime, and speed of data manipulation. 

Flex applications have a minimum initial download size of approximately 125 kilobytes, 
which users may consider slow on dialup connections. Ajax programs can be very small, 
although most frameworks introduce a certain initial download size, ranging from several 
kilobytes to a megabyte. Flex applications are already compressed, while for Ajax, GZip 
compression on the web server is typically used to compress the files. 
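
As a rough illustration of what GZip compression does to script files, the sketch below compresses a block of JavaScript-like text with Python's standard `gzip` module. The sample text is made up and highly repetitive, so the saving it shows is not representative of any particular framework; in practice the web server applies the compression when the browser advertises support for it.

```python
import gzip

# Made-up, repetitive JavaScript-like source standing in for a framework file.
script = ("function greet(name) { return 'Hello, ' + name + '!'; }\n" * 200).encode("utf-8")

compressed = gzip.compress(script)

print(f"original: {len(script)} bytes")
print(f"gzipped:  {len(compressed)} bytes")
```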

The performance of Ajax applications is based on the performance of the browser, which can 
vary from one browser to another. Frameworks such as Backbase and Jackbe have optimized the 
performance of their framework across browsers. Adobe optimizes the Flash Runtime for 
performance, and therefore has more control over performance. One of the benefits is that Flex 
can handle large data sets on the client-side, while Ajax requires more optimization to get good 
performance, especially in Internet Explorer. The best Ajax frameworks can handle more 
client-side data than other frameworks, due to careful optimization strategies. 


Widgets 


Adobe Flex comes with many user interface widgets in different skins. Custom Ajax does not 
come with pre-built widgets, but many Ajax frameworks do, ranging from a couple of widgets to 
over 100. Some Ajax widgets are stand-alone, while others are tightly integrated into an Ajax 
framework, using the framework’s underlying functionality for event handling, drag-and-drop, 
resizing, sorting, and so on. An underlying framework makes it easier to customize or extend 
widgets, and to create completely new widgets. Some Ajax frameworks come with multiple 
skins for the widgets. 


Extensibility 


The Flex SDK is now open source, so it can also be extended through external code 
contributions. Creating custom components in Flex is straightforward. Ajax is typically more extensible, 
because Ajax runs in browsers instead of a proprietary runtime. Custom widgets can also be 
created with Ajax, and Ajax frameworks simplify this task by providing reusable functionality. 


Multimedia 


Flex uses multimedia capabilities from Flash, such as vector graphics, animation, image 
manipulation, audio and video. Ajax does not have multimedia functionality built-in, but can use 
other browser functionalities and plug-ins such as SVG and QuickTime, or even the Flash plug-in. 


Server Communication 


Ajax uses the web browser’s underlying communication mechanism, which is HTTP. This 
supports pulling data from the server, as well as pushing data to the client using Server Push. 
Remoting is supported by various Ajax frameworks such as DWR. Adobe Flex also supports 
binary communication. 


SEO and Web Analytics 


Ajax and Flex are often used to build web applications that are not very content-rich. 
However, if there is a fair amount of content that needs to be indexed by search engines, search 
engine optimization (SEO) techniques become relevant. Adobe Flex has limited support for SEO 
because content cannot be read directly by search engines, which instead use meta data and 
HTML shadow pages to index Flex applications. HTML within Ajax applications can be indexed 
by search engines, although search bots cannot follow links that are generated by JavaScript 
logic. Proper application design can make Ajax applications fully accessible for search engines. 
Web analytics typically works well with Ajax applications, as all HTML files that are loaded can 
be tracked, even if they are loaded into an existing page. What happens within a Flex application 
is hidden from web analytics tools. 


Accessibility 


Adobe Flex is Section 508 compliant. Some Ajax frameworks such as Bindows and 
Backbase are also Section 508 compliant, but most other Ajax applications are not. 


Security 


Both Flex and Ajax run in the browser’s sandbox, and are generally seen as equally secure, 
though the Flash plug-in can introduce additional security risks. Most Ajax code is plain text and 
thus easy to read: this could make it easier for attackers to find vulnerabilities. 


Non-Functional Comparison 
User Experience 


Adobe has attempted to make Flex as user friendly as possible. However, critics of Flash 
such as Jakob Nielsen claim that Flash-based applications have inherent usability issues 
because visitors are used to using web pages as interfaces for web applications. Critics of Flash 
cite usability limitations such as the following: 

• Flash apps have a different feel: the cursor looks different, fonts look different, and 
links and right-click do not work as expected 
• Copy and paste of text does not work 
• Back-button does not always work as expected 
• Developers are more likely to introduce unintuitive interface concepts 

Flex proponents state that many of these issues can be addressed with proper application 
design and thorough usability tests. 


Standards Support 


Both Flex and Ajax support many web standards. Ajax uses a standards-based runtime (the 
web browser) that supports many standards out-of-the-box. Flex uses a proprietary runtime that 
has implemented support or partial support for many standards. Any standard that is available in 
web browsers is immediately available for Ajax applications, while Adobe has to implement 
these standards in its Flash runtime. 


Vendor Support 


Flex is supported by a single large vendor: Adobe. Ajax in general is not supported by a 
vendor, though some Ajax frameworks provide support and several open source Ajax projects 
are supported by their developer community on a voluntary basis. Most commercial Ajax 
vendors provide support plans, ranging from basic email support to 24x7 phone support with an 
SLA. 


Web accessibility 


Web accessibility refers to the practice of making websites usable by people of all abilities 
and disabilities. When sites are correctly designed, developed and edited, all users can have 
equal access to information and functionality. For example, when a site is coded with 
semantically meaningful HTML, with textual equivalents provided for images and with links 
named meaningfully, this helps blind users using text-to-speech software and/or text-to-Braille 
hardware. When text and images are large and/or enlargeable, it is easier for users with poor sight 
to read and understand the content. When links are underlined (or otherwise differentiated) as 
well as coloured, this ensures that colour blind users will be able to notice them. When clickable 
links and areas are large, this helps users who cannot control a mouse with precision. When 
pages are coded so that users can navigate by means of the keyboard alone, or a single switch 
access device alone, this helps users who cannot use a mouse or even a standard keyboard. When 
videos are closed captioned or a sign language version is available, deaf and hard of hearing 
users can understand video. When flashing effects are avoided or made optional, users prone to 
seizures caused by these effects are not put at risk. And when content is written in plain language 
and illustrated with instructional diagrams and animations, users with dyslexia and learning 
difficulties are better able to understand the content. When sites are correctly built and 
maintained, all of these users can be accommodated while not impacting on the usability of the 
site for non-disabled users. 

The needs that Web accessibility aims to address include: 

• Visual: Visual impairments including blindness, various common types of low 
vision and poor eyesight, various types of colour blindness; 

• Motor/Mobility: e.g. difficulty or inability to use the hands, including tremors, 
muscle slowness, loss of fine muscle control, etc., due to conditions such as 
Parkinson's Disease, muscular dystrophy, cerebral palsy, stroke; 

• Auditory: Deafness or hearing impairments, including individuals who are hard 
of hearing; 

• Seizures: Photoepileptic seizures caused by visual strobe or flashing effects; 

• Cognitive/Intellectual: Developmental disabilities, learning disabilities 
(dyslexia, dyscalculia, etc.), and cognitive disabilities of various origins, affecting 
memory, attention, developmental "maturity," problem-solving and logic skills, 
etc. 


Assistive technologies used for web browsing 


Disabled users use assistive technologies such as the following to enable and assist web 
browsing: 

• Screen reader software, which can read out, using synthesised speech, either 
selected elements of what is being displayed on the monitor (helpful for users 
with reading or learning difficulties), or which can read out everything that is 
happening on the PC (used by blind and vision impaired users). 

• Braille terminals, consisting of a Refreshable Braille display which renders text 
as Braille characters (usually by means of raising pegs through holes in a flat 
surface) and either a QWERTY or Braille keyboard. 

• Screen magnification software, which enlarges what is displayed on the 
computer monitor, making it easier to read for vision impaired users. 

• Speech recognition software that can accept spoken commands to the 
computer, or turn dictation into grammatically correct text - useful for those 
who have difficulty using a mouse or a keyboard. 

• Keyboard overlays which can make typing easier and more accurate for those 
who have motor control difficulties. 


Guidelines on accessible web design 


Web Content Accessibility Guidelines 


In 1999 the Web Accessibility Initiative, a project by the World Wide Web Consortium 
(W3C), published the Web Content Accessibility Guidelines WCAG 1.0. In recent years, these 
have been widely accepted as the definitive guidelines on how to create accessible websites. 
Since 2003, the WAI has been working on the second edition of these guidelines, the WCAG 
2.0, which aim to be up to date and more technology neutral. This is currently at the Working 
Draft stage. 


Criticism of WAI guidelines 


In articles such as WCAG 2.0: The new W3C guidelines evaluated, To Hell with WCAG 2.0 
and Testability Costs Too Much, the WAI has been criticised for allowing WCAG 1.0 to get 
increasingly out of step with today's technologies and techniques for creating and consuming 
web content, for the slow pace of development of WCAG 2.0, for making the new guidelines 
difficult to navigate and understand, and other argued failings. In one attempt to provide 
guidelines that are designed to be up to date, easier to understand, and more relevant and 
practical to typical web development projects, Joe Clark's WCAG Samurai project has published 
an unofficial set of errata to WCAG 1.0. 


Other guidelines 
Philippines 


As part of the Web Accessibility Initiatives in the Philippines, the government, through the 
National Council for the Welfare of Disabled Persons (NCWDP) board, approved the 
recommendation to form an ad hoc or core group of webmasters that will help in the 
implementation of the Biwako Millennium Framework set by the UNESCAP. 

The Philippines also hosted the Interregional Seminar and Regional 
Demonstration Workshop on Accessible Information and Communications Technologies (ICT) 
to Persons with Disabilities, at which eleven countries from the Asia-Pacific region were 
represented. The Manila Accessible Information and Communications Technologies Design 
Recommendations was drafted and adopted in 2003. 


United Kingdom 


In the UK, the Disability Rights Commission (DRC), in collaboration with BSI, has 
published PAS 78, which outlines good practice in commissioning accessible websites. 


Legally required web accessibility 


A growing number of countries around the world have introduced legislation which either 
directly addresses the need for websites and other forms of communication to be accessible to 
people with disabilities, or which addresses the more general requirement for people with 
disabilities not to be discriminated against. 


Australia 


In 2000, a blind Australian man won a court case against the Sydney Organising Committee 
of the Olympic Games (SOCOG). This was the first successful case under the Disability 
Discrimination Act 1992; it was brought because SOCOG had failed to make its official website, Sydney 
Olympic Games, adequately accessible to blind users. The Human Rights and Equal Opportunity 
Commission (HREOC) also published World Wide Web Access: Disability Discrimination Act 
Advisory Notes. All Governments in Australia also have policies and guidelines that require 
accessible public websites; Vision Australia maintain a complete list of Australian web 
accessibility policies. 


Ireland 


In Ireland, the Disability Act 2005 was supplemented with the National Disability Authority's 
Code of Practice on Accessible Public Services in July 2006. It is a practical guide to help all 
Government Departments and nearly 500 public bodies to comply with their obligations under 
the Disability Act 2005. 


United Kingdom 


In the UK, the Disability Discrimination Act 1995 (DDA) does not refer explicitly to website 
accessibility, but makes it illegal to discriminate against people with disabilities. The DDA 
applies to anyone providing a service; public, private and voluntary sectors. The Code of 
Practice: Rights of Access - Goods, Facilities, Services and Premises document published by the 
government's Disability Rights Commission to accompany the Act does refer explicitly to 
websites as one of the "services to the public" which should be considered covered by the Act. 


United States 


In the U.S., Section 508 of the Rehabilitation Act requires that Federal agencies and their 
contractors give disabled employees and members of the public access to information (including 
web sites) that is comparable to the access available to others; the Americans with Disabilities 
Act of 1990 prohibits discrimination on the basis of disability; and Section 255 of the 
Telecommunications Act of 1996 requires suppliers to make telecommunications products and 
services accessible where this does not require significant difficulty or expense. Exactly how the 
latter two apply to Web site accessibility is complicated and dependent on case law. 

On September 7, 2006, Judge Marilyn Hall Patel ruled in National Federation of the Blind v. 
Target Corporation that a retailer with a physical storefront may be sued if its website is 
inaccessible to the blind. However, Judge Patel did not rule on the merits of the plaintiff's case, 
which will be adjudicated at a later date. 


Website accessibility audits 


A growing number of organisations, companies and consultants offer website accessibility 
audits. These audits, a type of system testing, identify accessibility problems that exist within a 
website, and provide advice and guidance on the steps that need to be taken to correct these 
problems. 

A range of methods are used to audit websites for accessibility: 

• Automated tools are available which can identify some of the problems that are 
present. 

• Expert technical reviewers, knowledgeable in web design technologies and 
accessibility, can review a representative selection of pages and provide 
detailed feedback and advice based on their findings. 

• User testing, usually overseen by technical experts, involves setting tasks for 
ordinary users to carry out on the website, and reviewing the problems these 
users encounter as they try to carry out the tasks. 

Each of these methods has its strengths and weaknesses: 

• Automated tools can process many pages in a relatively short length of time, but 
can only identify some of the accessibility problems that might be present in the 
website. 

• Technical expert review will identify many of the problems that exist, but the 
process is time consuming, and many websites are too large to make it possible 
for a person to review every page. 

• User testing combines elements of usability and accessibility testing, and is 
valuable for identifying problems that might otherwise be overlooked, but 
needs to be used knowledgeably to avoid the risk of basing design decisions on 
one user's preferences. 

Ideally, a combination of methods should be used to assess the accessibility of a website. 
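
As a small illustration of what an automated tool can and cannot do, the sketch below flags images that have no `alt` attribute, using Python's standard `html.parser` module. The class and function names are hypothetical and the check is deliberately narrow: it can detect that a textual equivalent is missing, but it cannot judge whether an existing one is meaningful, which is why expert review and user testing remain necessary.

```python
from html.parser import HTMLParser


class MissingAltChecker(HTMLParser):
    """Collects <img> tags that lack an alt attribute (hypothetical checker)."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "img" and "alt" not in dict(attrs):
            line, _column = self.getpos()
            self.problems.append(f"line {line}: <img> without alt text")


def audit(html_text):
    """Return a list of problem descriptions found in html_text."""
    checker = MissingAltChecker()
    checker.feed(html_text)
    checker.close()
    return checker.problems


sample_page = (
    '<p><img src="logo.png" alt="Company logo"></p>\n'
    '<p><img src="chart.png"></p>'
)
for problem in audit(sample_page):
    print(problem)
```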


See also 


• Web Interoperability 


Web Content Accessibility Guidelines 


Web Content Accessibility Guidelines (WCAG) are part of a series of Web accessibility 
guidelines published by the W3C's Web Accessibility Initiative. They consist of a set of 
guidelines on making content accessible, primarily for disabled users, but also for all user agents, 
including highly limited devices, such as mobile phones. 


Priority levels 


The guidelines have three priority levels: 

• Priority 1: Web developers must satisfy these requirements, otherwise it will be 
impossible for one or more groups to access the Web content. Conformance to 
this level is described as "A". 

• Priority 2: Web developers should satisfy these requirements, otherwise some 
groups will find it difficult to access the Web content. Conformance to this level 
is described as "AA" or "double A". 

• Priority 3: Web developers may satisfy these requirements, in order to make it 
easier for some groups to access the Web content. Conformance to this level is 
described as "AAA" or "triple A". 
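
The mapping from priorities to conformance levels can be sketched as a short function. It assumes the cumulative reading used by WCAG 1.0, in which "AA" requires Priority 1 and 2 and "AAA" requires all three; the function name and the input format (a set of fully satisfied priority numbers) are choices made for this example.

```python
def conformance_level(satisfied):
    """Return "A", "AA", "AAA", or None for a set of satisfied priorities.

    `satisfied` holds the priority numbers (1, 2, 3) whose requirements
    are all met. Levels are cumulative, so a gap stops the count.
    """
    level = ""
    for priority in (1, 2, 3):
        if priority not in satisfied:
            break
        level += "A"
    return level or None


print(conformance_level({1, 2}))
```

Under this reading, a site that meets Priority 1 and Priority 3 but not Priority 2 conforms only at level "A".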


WCAG 1.0 


The WCAG 1.0 were published and became a W3C recommendation on May 5, 1999. 


WCAG 2.0 


The first working draft of what will become the WCAG 2.0 W3C Recommendation was 
published on January 25, 2001, the latest version on May 17, 2007. The five year process 
encouraged participation in editing (and responding to the hundreds of comments) by the 
Working Group, with diversity assured by inclusion of accessibility experts and members of the 
disability community. 

There has been some criticism depicting WCAG 2.0 as obscure, vague, and perhaps even a 
backwards step for Web accessibility, as well as criticism of the criticism. 


Web design 


Web design is a process of conceptualization, planning, modeling, and execution of 
electronic media delivery via the Internet in the form of a markup language suitable for interpretation 
by a Web browser and display as a graphical user interface (GUI). 

The intent of web design is to create a web site -- a collection of electronic files that reside on 
a web server or servers and present content and interactive features/interfaces to the end user in 
the form of Web pages once requested. Elements such as text, bit-mapped images (GIFs, JPEGs, 
PNGs) and forms can be placed on the page using HTML/XHTML/XML tags. Displaying more 
complex media (vector graphics, animations, videos, sounds) requires plug-ins such as Flash, 
QuickTime, the Java run-time environment, etc. Plug-ins are also embedded into a web page by using 
HTML/XHTML tags. 

Improvements in browsers' compliance with W3C standards prompted a widespread 
acceptance and usage of XHTML/XML in conjunction with Cascading Style Sheets (CSS) to 
position and manipulate web page elements and objects. The latest standards and proposals aim to 
enable browsers to deliver a wide variety of media and accessibility options to the 
client, possibly without employing plug-ins. 

Typically web pages are classified as static or dynamic. 

Static pages don’t change content and layout with every request unless a human (web 
master/programmer) manually updates the page. 

Dynamic pages adapt their content and/or appearance depending on the end user’s 
input/interaction or changes in the computing environment (user, time, database modifications, 
etc.). Content can be changed on the client side (end user's computer) by using client-side 
scripting languages (JavaScript, JScript, ActionScript, etc.) to alter DOM elements (DHTML). 
Dynamic content is often compiled on the server utilizing server-side scripting languages (Perl, 
PHP, ASP, JSP, ColdFusion, etc.). Both approaches are usually used in complex applications. 
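
The server-side case can be illustrated with a minimal sketch in which the same function returns different HTML depending on the user and the time of the request. Python is used here only for illustration, and the function name, the greeting rule, and the page structure are all assumptions of the example.

```python
from datetime import datetime
from html import escape


def render_page(user_name, now):
    """Build a page whose content depends on the user and the request time."""
    greeting = "Good morning" if now.hour < 12 else "Good afternoon"
    # Escape user-supplied text before placing it into markup.
    return f"<html><body><h1>{greeting}, {escape(user_name)}!</h1></body></html>"


print(render_page("Ada", datetime(2007, 5, 17, 9, 0)))
```

A static page, by contrast, would be a fixed file returned unchanged for every request.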

With growing specialization in the information technology field there is a strong tendency to 
draw a clear line between web design and web development. 


History 


Tim Berners-Lee, the inventor of the World Wide Web, published a website in August 
1991.[1] Berners-Lee was the first to combine Internet communication (which had been carrying 
email and the Usenet for decades) with hypertext (which had also been around for decades, but 
limited to browsing information stored on a single computer, such as interactive CD-ROM 
design). 

Websites are written in a markup language called HTML, and early versions of HTML were 
very basic, only giving websites basic structure (headings and paragraphs), and the ability to link 
using hypertext. This was new and different to existing forms of communication - users could 
easily navigate to other pages by following hyperlinks from page to page. 

As the Web and web design progressed, the markup language used to make it became more 
complex and flexible, giving the ability to add objects like images and tables to a page. Features 
like tables, which were originally intended to be used to display tabular information, were soon 
subverted for use as invisible layout devices. With the advent of Cascading Style Sheets (CSS), 
table-based layout is increasingly regarded as outdated. Database integration technologies such 
as server-side scripting and design standards like CSS further changed and enhanced the way the 
Web is made. 

The introduction of Macromedia Flash (now Adobe Flash) into an already interactivity-ready 
scene has further changed the face of the Web, giving new power to designers and media 
creators, and offering new interactivity features to users, often at the expense of usability for 
persons with disabilities, search engine visibility and browser functions available to HTML. 


Web site design 


A Web site is a collection of information about a particular topic or subject. Designing a 
website is defined as the arrangement and creation of Web pages that in turn make up a website. 
A Web page consists of information for which the Web site is developed. A website might be 
compared to a book, where each page of the book is a web page. 

There are many aspects (design concerns) in this process, and due to the rapid development 
of the Internet, new aspects may emerge. For typical commercial Web sites, the basic aspects of 
design are: 

• The content: The substance and information on the site should be relevant to 
the site and should target the area of the public that the website is concerned 
with. 

• The usability: The site should be user-friendly, with the interface and navigation 
simple and reliable. 

• The appearance: The graphics and text should include a single style that flows 
throughout, to show consistency. The style should be professional, appealing 
and relevant. 

• The visibility: The site must also be easy to find via most, if not all, major search 
engines and advertisement media. 

A Web site typically consists of text and images. The first page of a website is known as the 
Home page or Index. Some websites use what is commonly called a Splash Page. Splash pages 
might include a welcome message, language/region selection, or disclaimer. Each web page 
within a Web site is an HTML file which has its own URL. After each Web page is created, they 
are typically linked together using a navigation menu composed of hyperlinks. Faster browsing 
speeds have led to shorter attention spans and more demanding online visitors and this has 
resulted in less use of Splash Pages, particularly where commercial websites are concerned. 

Once a Web site is completed, it must be published or uploaded in order to be viewable to the 
public over the internet. This may be done using an FTP client. Once published, the Web master 
may use a variety of techniques to increase the traffic, or hits, that the website receives. This may 
include submitting the Web site to a search engine such as Google or Yahoo, exchanging links 
with other Web sites, creating affiliations with similar Web sites, etc. 


Multidisciplinary requirements 


Web site design crosses multiple disciplines of information systems, information technology 
and communication design. The website is an information system whose components are 
sometimes classified as front-end and back-end. The observable content (e.g. page layout, user 
interface, graphics, text, audio) is known as the front-end. The back-end comprises the 
organization and efficiency of the source code, invisible scripted functions, and the server-side 
components that process the output from the front-end. Depending on the size of a Web 
development project, it may be carried out by a multi-skilled individual (sometimes called a web 
master), or a project manager may oversee collaborative design between group members with 
specialized skills. 


Issues 


As in most collaborative designs, there are conflicts between differing goals and methods of 
web site designs. These are a few of the ongoing ones. 


Lack of collaboration in design 


In the early stages of the web, there wasn't as much collaboration between web designs and 
larger advertising campaigns, customer transactions, social networking, intranets and extranets as 
there is now. Web pages were mainly static online brochures disconnected from the larger 
projects. 

Many web pages are still disconnected from larger projects. Special design considerations are 
necessary for use within these larger projects. These design considerations are often overlooked, 
especially in cases where there is a lack of leadership, understanding or concern for the larger 
project to facilitate collaboration. This often results in unhealthy competition or compromise 
between departments, and less than optimal use of web pages. 


Liquid versus fixed layouts 


On the web the designer has no control over several factors, including the size of the browser 
window, the web browser used, the input devices used (mouse, touch screen, voice command, 
text, cell phone number pad, etc.) and the size and characteristics of available fonts. 

Some designers choose to control the appearance of the elements on the screen by using 
specific width designations. This control may be achieved through the use of an HTML 
table-based design, or through the use of CSS. Whenever the text, images, and layout of a design do 
not change as the browser changes, this is referred to as a fixed width design. Proponents of 
fixed width design prefer the control over the look and feel of the site and the precision 
placement of objects on the page. Other designers choose a liquid design. A liquid design is one, 
like Wikipedia, where the design moves to flow content into the whole screen, or a portion of the 
screen, no matter what the size of the browser window. Proponents of liquid design prefer to use 
all the screen space available. Liquid design can be achieved through the use of CSS, by 
avoiding styling the page altogether, or by using HTML tables set to a percentage of the page. 
Both liquid and fixed design developers must make decisions about how the design should 
degrade on higher and lower screen resolutions. Sometimes the pragmatic choice is made to flow 
the design between a minimum and a maximum width. This allows the designer to avoid coding 
for the browser choices making up the long tail, while still using all available screen space. 
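
The compromise described above, a layout that flows between a minimum and a maximum width, can be sketched in CSS roughly as follows. The class name page and all numeric values are assumptions chosen for illustration, and the sketch presumes a browser that implements the CSS2 min-width and max-width properties.

```css
/* Hypothetical wrapper around the page content; the name "page" is illustrative. */
.page {
    width: 90%;          /* liquid: the column follows the browser window */
    min-width: 600px;    /* stop shrinking below this width */
    max-width: 1200px;   /* stop growing beyond this width */
    margin: 0 auto;      /* centre the column when the window is wider */
}
```

Between the two limits the layout behaves as a liquid design; outside them it behaves as a fixed one.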

Similar to liquid layout is the optional fit to window feature with Adobe Flash content. This is 
a fixed layout that optimally scales the content of the page without changing the arrangement or 
text wrapping when the browser is resized. 


Flash 


Adobe Flash (formerly Macromedia Flash) is a proprietary, robust graphics 
animation/application development program used to create and deliver dynamic content, media 
(such as sound and video), and interactive applications over the web via the browser. 

Flash is not a standard produced by a vendor-neutral standards organization like most of the 
core protocols and formats on the Internet. Flash is much more restrictive than the open HTML 
format, though, requiring a proprietary plugin to be seen, and it does not integrate with most web 
browser UI features like the "Back" button unless a hyperlink is programmed to link a new html 
page from the Flash file, in which case the animation of the previous page would reset. However, 
those restrictions may be irrelevant depending on the goals of the web site design. 

According to one study, 98% of US Web users have the Flash Player installed, with 45% to 
56% (depending on region) having the latest version. Numbers vary depending on the detection 
scheme and research demographics. 

Many graphic artists use Flash because it gives them exact control over every part of the 
design, and anything can be animated and generally "jazzed up". Some application designers 
enjoy Flash because it lets them create applications that don't have to be refreshed or go to a new 
web page every time an action occurs. Flash can use embedded fonts instead of the standard 
fonts installed on most computers. There are many sites which forgo HTML entirely for Flash. 
Other sites may use Flash content combined with HTML as conservatively as GIFs or JPEGs would 
be used, but with smaller vector file sizes and the option of faster loading animations. Flash may 
also be used to protect content from unauthorized duplication or searching. 

Flash detractors claim that Flash websites tend to be poorly designed, and often use 
confusing and non-standard user-interfaces. Up until recently, search engines have been unable 
to index Flash objects, which has prevented sites from having their contents easily found. This is 
because many search engine crawlers rely on text to index websites. It is possible to specify 
alternate content to be displayed for browsers that do not support Flash. Using alternate content 
also helps search engines to understand the page, and can result in much better visibility for the 
page. However, the vast majority of Flash websites are not disability accessible (for screen 
readers, for example) or Section 508 compliant. An additional issue is that sites which serve 
different content to search engines than to their human visitors are usually judged to be 
spamming search engines and are automatically banned. 

The most recent incarnation of Flash's scripting language (called "ActionScript", which is an 
ECMAScript-based language similar to JavaScript) incorporates long-awaited usability features, such as 
respecting the browser's font size and allowing blind users to use screen readers. ActionScript 2.0 
is an object-oriented language, allowing the use of CSS, XML, and the design of class-based 
web applications. 


CSS versus tables 


For more details on this topic, see Tableless web design. 

Back when Netscape Navigator 4 dominated the browser market, the popular solution 
available for designers to lay out a Web page was by using tables. Often even simple designs for 
a page would require dozens of tables nested in each other. Many web templates in 
Dreamweaver and other WYSIWYG editors still use this technique today. Navigator 4 didn't 
support CSS to a useful degree, so it simply wasn't used. 

After the browser wars were over, and Internet Explorer dominated the market, designers 
started turning toward CSS as an alternate means of laying out their pages. CSS proponents say 
that tables should be used only for tabular data, not for layout. Using CSS instead of tables also 
returns HTML to a semantic markup, which helps bots and search engines understand what's 
going on in a web page. All modern Web browsers support CSS with different degrees of 
limitations. 

However, one of the main points against CSS is that by relying on it exclusively, control is 
essentially relinquished as each browser has its own quirks which result in a slightly different 
page display. This is especially a problem as not every browser supports the same subset of CSS 
rules. For designers who are used to table-based layouts, developing Web sites in CSS often 
becomes a matter of trying to replicate what can be done with tables, leading some to find CSS 
design rather cumbersome due to lack of familiarity. For example, at one time it was rather 
difficult to produce certain design elements, such as vertical positioning, and full-length footers 
in a design using absolute positions. With the abundance of CSS resources available online 
today, though, designing with reasonable adherence to standards involves little more than 
applying CSS 2.1 or CSS 3 to properly structured markup. 

These days most modern browsers have solved most of these quirks in CSS rendering and 
this has made many different CSS layouts possible. However, some people continue to use old 
browsers, and designers need to keep this in mind, and allow for graceful degrading of pages in 
older browsers. Most notable among these old browsers are Internet Explorer 5 and 5.5, which, 
according to some web designers, are becoming the new Netscape Navigator 4 — a block that 
holds the World Wide Web back from converting to CSS design. However, the W3 Consortium 
has made CSS in combination with XHTML the standard for web design. 


How it Looks vs. How it Works 


Some web developers have a graphic arts background and may pay more attention to how a 
page looks than considering other issues such as how visitors are going to find the page via a 
search engine. Some might rely more on advertising than search engines to attract visitors to the 
site. On the other side of the issue, search engine optimization consultants (SEOs) obsess about 
how well a web site works technically and textually: how much traffic it generates via search 
engines, and how many sales it makes, assuming looks don't contribute to the sales. As a result, 
the designers and SEOs often end up in disputes where the designer wants more 'pretty' graphics, 
and the SEO wants lots of 'ugly' keyword-rich text, bullet lists, and text links. One could argue 
that this is a false dichotomy due to the possibility that a web design may integrate the two 
disciplines for a collaborative and synergistic solution. Because some graphics serve 
communication purposes in addition to aesthetics, how well a site works may depend on the 
graphic designer's visual communication ideas as well as the SEO considerations. 

Another problem when using lots of graphics on a page is that download times can be greatly 
lengthened, often irritating the user. This has become less of a problem as the internet has 
evolved with high-speed internet and the use of vector graphics. This is an engineering challenge 
to increase bandwidth in addition to an artistic challenge to minimize graphics and graphic file 
sizes. This is an on-going challenge as increased bandwidth invites increased amounts of content. 


Accessible Web design 


Main article: Web accessibility 

Accessible Web design is the art of creating webpages that are accessible to everyone, using 
any device. It is especially important so that people with disabilities - whether due to accident, 
disease or old age - can access the information in Web pages and be able to navigate through the 
website. 

To be accessible, web pages and sites must conform to certain accessibility principles. These 
can be grouped into the following main areas: 

• use semantic markup that provides a meaningful structure to the document (i.e. 
web page) 

    o Semantic markup also refers to semantically organizing the web page structure 
    and publishing web services description accordingly so that they can be 
    recognised by other web services on different web pages. Standards for the 
    Semantic Web are set by the W3C 

• use a valid markup language that conforms to a published DTD or Schema 

• provide text equivalents for any non-text components (e.g. images, multimedia) 

• use hyperlinks that make sense when read out of context (e.g. avoid "Click 
Here") 

• don't use frames 

• use CSS rather than HTML tables for layout 

• author the page so that when the source code is read line-by-line by user agents 
(such as a screen reader) it remains intelligible. (Using tables for design will 
often result in information that is not.) 

However, W3C permits an exception where tables for layout either make sense when 
linearized or an alternate version (perhaps linearized) is made available. 


Website Planning 


Before creating and uploading a website, it is important to take the time to plan exactly what 
is needed in the website. Thoroughly considering the audience or target market, as well as 
defining the purpose and deciding what content will be developed are extremely important. 


Purpose 


It is essential to define the purpose of the website as one of the first steps in the planning 
process. A purpose statement should show focus based on what the website will accomplish and 
what the users will get from it. A clearly defined purpose will help the rest of the planning 
process as the audience is identified and the content of the site is developed. Setting short and 
long term goals for the website will help make the purpose clear and plan for the future when 
expansion, modification, and improvement will take place. Also, goal-setting practices and 
measurable objectives should be identified to track the progress of the site and determine 
success. 


Audience 


Defining the audience is a key step in the website planning process. The audience is the 
group of people who are expected to visit your website — the market being targeted. These people 
will be viewing the website for a specific reason and it is important to know exactly what they 
are looking for when they visit the site. A clearly defined purpose or goal of the site as well as an 
understanding of what visitors want to do/feel when they come to your site will help to identify 
the target audience. Upon considering who is most likely to need/use the content, a list of 
characteristics common to the users can be compiled, such as: 

• Audience Characteristics 
• Information Preferences 
• Computer Specifications 
• Web Experience 

Taking into account the characteristics of the audience will allow an effective website to be 
created that will deliver the desired content to the target audience. 


Content 


Content evaluation and organization requires that the purpose of the website be clearly 
defined. Collecting a list of the necessary content then organizing it according to the audience's 
needs is a key step in website planning. In the process of gathering the content being offered, any 
items that do not support the defined purpose or accomplish target audience objectives should be 
removed. It is a good idea to test the content and purpose on a focus group and compare the 
offerings to the audience needs. The next step is to organize the basic information structure by 
categorizing the content and organizing it according to user needs. Each category should be 
named with a concise and descriptive title that will become a link on the website. Planning for 
the site's content ensures that the wants/needs of the target audience and the purpose of the site 
will be fulfilled. 


Compatibility and restrictions 


Because of the market share of the various browsers (which depends on the target market), the 
features a website can use while remaining compatible with its viewers are restricted. For 
instance, a website that is 
designed for the majority of websurfers will be limited to the use of valid XHTML 1.0 Strict or 
older, Cascading Style Sheets Level 1, and 1024x768 display resolution. This is because Internet 
Explorer is not fully W3C standards compliant with the modularity of XHTML 1.1 and the 
majority of CSS beyond 1. A target market of more alternative browser (e.g. Firefox and Opera) 
users allows for more W3C compliance and thus a greater range of options for a web designer. 

Another restriction on webpage design is the use of different image file formats. The 
majority of users can support GIF, JPEG, and PNG (with restrictions). Again Internet Explorer is 
the major restriction here, not fully supporting PNG's advanced transparency features, resulting 
in the GIF format still being the most widely used graphic file format for transparent images. 

Many website incompatibilities go unnoticed by the designer and unreported by the users. 
The only way to be certain a website will work on a particular platform is to test it on that 
platform. 


Planning documentation 


Documentation is used to visually plan the site while taking into account the purpose, 
audience and content, to design the site structure, content and interactions that are most suitable 
for the website. Documentation may be considered a prototype for the website — a model which 
allows the website layout to be reviewed, resulting in suggested changes, improvements and/or 
enhancements. This review process increases the likelihood of success of the website. 

First, the content is categorized and the information structure is formulated. The information 
structure is used to develop a document or visual diagram called a site map. This creates a visual 
of how the web pages will be interconnected, which helps in deciding what content will be 
placed on what pages. There are three main ways of diagramming the website structure: 

• Linear Website Diagrams will allow the users to move in a predetermined 
sequence; 

• Hierarchical structures (or Tree Design Website Diagrams) provide more than 
one path for users to take to their destination; 

• Branch Design Website Diagrams allow for many interconnections between 
web pages such as hyperlinks within sentences. 

In addition to planning the structure, the layout and interface of individual pages may be 
planned using a storyboard. In the process of storyboarding, a record is made of the description, 
purpose and title of each page in the site, and they are linked together according to the most 
effective and logical diagram type. Depending on the number of pages required for the website, 
documentation methods may include using pieces of paper and drawing lines to connect them, or 
creating the storyboard using computer software. 

Some or all of the individual pages may be designed in greater detail as a website wireframe, 
a mock-up model or comprehensive layout of what the page will actually look like. This is often 
done in a graphic program, or layout design program. The wireframe has no working 
functionality, only planning. 


See also 


• Website architecture 
• Website builder 

Cascading Style Sheets | Printer friendly | Brochureware | Digital strategy | DOM scripting | 
Fahrner Image Replacement | Microformat | Progressive enhancement | Rollover | Spacer GIF | 
Techniques for creating a User Centered Design | URL redirection | Web Interoperability | Web 
modeling | Web template | Web-safe fonts | Website architecture | Website wireframe 


Cascading Style Sheets 


In web development, Cascading Style Sheets (CSS) is a stylesheet language used to 
describe the presentation of a document written in a markup language. Its most common 
application is to style web pages written in HTML and XHTML, but the language can be applied 
to any kind of XML document, including SVG and XUL. 

CSS is used by both the authors and readers of web pages to define colors, fonts, layout, and 
other aspects of document presentation. It is designed primarily to enable the separation of 
document content (written in HTML or a similar markup language) from document presentation 
(written in CSS). This separation can improve content accessibility, provide more flexibility and 
control in the specification of presentational characteristics, and reduce complexity and repetition 
in the structural content. CSS can also allow the same markup page to be presented in different 
styles for different rendering methods, such as on-screen, in print, by voice (when read out by a 
speech-based browser or screen reader) and on Braille-based, tactile devices. CSS specifies a 
priority scheme to determine which style rules apply if more than one rule matches against a 
particular element. In this so-called cascade, priorities or weights are calculated and assigned to 
rules, so that the results are predictable. 
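
As a rough illustration of the cascade, consider three rules that all match the same paragraph. The class name note and the id intro are invented for this sketch; the specificity values in the comments use the a,b,c,d notation of CSS 2.1.

```css
/* All three rules match an element written as: <p id="intro" class="note"> */
p       { color: black; }   /* specificity 0,0,0,1 */
p.note  { color: gray;  }   /* specificity 0,0,1,1: overrides the rule above */
#intro  { color: green; }   /* specificity 0,1,0,0: overrides both, so the text is green */
```

Between rules of equal specificity and importance from the same source, the one that appears later in the style sheet applies.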

The CSS specifications are maintained by the World Wide Web Consortium (W3C). Internet 
media type (MIME type) text/css is registered for use with CSS by RFC 2318 (March 1998). 


Syntax 


CSS has a simple syntax, and uses a number of English keywords to specify the names of 
various style properties. 

A style sheet consists of a list of rules. Each rule or rule-set consists of one or more selectors 
and a declaration block. A declaration block consists of a list of semicolon-separated 
declarations in curly braces. Each declaration itself consists of a property, a colon (:), a value, 
then a semicolon (;). 
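
The anatomy of a rule described above might look like this in practice. The particular selectors, properties and values are arbitrary choices for illustration.

```css
/* Two selectors, separated by a comma, share one declaration block. */
h1, h2 {
    color: navy;               /* declaration = property, colon, value, semicolon */
    font-family: sans-serif;   /* a second declaration in the same block */
}
```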

In CSS, selectors are used to declare which elements a style applies to, a kind of match 
expression. Selectors may apply to all elements of a specific type, or only those elements which 
match a certain attribute; elements may be matched depending on how they are placed relative to 
each other in the markup code, or on how they are nested within the document object model. 

In addition to these, a set of pseudo-classes can be used to define further behavior. Probably 
the best-known of these is :hover, which applies a style only when the user 'points to' the visible 
element, usually by holding the mouse cursor over it. It is appended to a selector as in a:hover or 
#elementid:hover. Other pseudo-classes and pseudo-elements are, for example, :first-line, 
:visited or :before. A special pseudo-class is :lang(c), where the style would be applied on an 
element only if it is in language "c". 

A pseudo-class selects entire elements, such as :link or :visited, whereas a pseudo-element 
makes a selection that may consist of partial elements, such as :first-line or :first-letter. 

Selectors may be combined in other ways too, especially in CSS 2.1, to achieve greater 
specificity and flexibility. 
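
The kinds of selector discussed in this section can be sketched side by side. The property values are arbitrary, and the language code fr is only an example.

```css
/* Type selector: every p element */
p { line-height: 1.4; }

/* Attribute selector: only a elements that carry an href attribute */
a[href] { color: blue; }

/* Descendant selector: li elements nested anywhere inside a ul */
ul li { margin-left: 1em; }

/* Child selector: em elements that are direct children of an h1 */
h1 > em { font-style: normal; }

/* Pseudo-classes select whole elements in a particular state */
a:visited { color: purple; }
a:hover   { text-decoration: underline; }

/* Pseudo-elements select part of an element */
p:first-line { font-variant: small-caps; }

/* :lang(c) matches elements whose language is c, here French */
:lang(fr) { font-style: italic; }
```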


Use of CSS 


Prior to CSS, nearly all of the presentational attributes of HTML documents were contained 
within the HTML markup; all font colors, background styles, element alignments, borders and 
sizes had to be explicitly described, often repeatedly, within the HTML. CSS allows authors to 
move much of that information to a separate stylesheet resulting in considerably simpler HTML 
markup. 

Headings (h1 elements), sub-headings (h2), sub-sub-headings (h3) etc. are defined 
structurally using HTML. In print and on the screen, choice of font, size, color and emphasis for 
these elements is presentational. 

Prior to CSS, document authors who wanted to assign such typographic characteristics to, 
say, all h2 headings had to use the HTML font and other presentational elements for each 
occurrence of that heading type. The additional presentational markup in the HTML made 
documents more complex, and generally more difficult to maintain. To render all h2 tags in this 
manner, the markup had to be repeated for each heading. In CSS, presentation is separated from 
structure. In print, CSS can define color, font, text alignment, size, borders, spacing, layout and 
many other typographic characteristics. It can do so independently for on-screen and printed 
views. CSS also defines non-visual styles such as the speed and emphasis with which text is read 
out by aural text readers. The W3C now considers the advantages of CSS for defining all aspects 
of the presentation of HTML pages to be superior to other methods. It has therefore deprecated 
the use of all the original presentational HTML markup. 
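
To make the contrast concrete, a single CSS rule can carry the typographic choices for every h2 heading in a document, instead of repeating presentational markup at each heading. The red italic style and the print override below are illustrative choices, not something prescribed by the specification.

```css
/* Applies to every h2 in every document that uses this style sheet. */
h2 {
    color: red;
    font-style: italic;
}

/* A different presentation for the printed view of the same markup. */
@media print {
    h2 { color: black; }
}
```

Changing the appearance of all such headings then means editing one rule rather than every occurrence in the HTML.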


Style sheet sources 


CSS information can be provided by various sources. CSS style information can be either 
attached as a separate document or embedded in the HTML document. Multiple style sheets can 
be imported, and alternative style sheets can be specified so that the user can choose between 
them. Different styles can be applied depending on the output device being used; for example, 
the screen version can be quite different from the printed version, so that authors can tailor the 
presentation appropriately for each medium. 

• Author styles (style information provided by the web page author), in the form of 
    o external stylesheets, i.e. a separate CSS-file referenced from the document 
    o embedded style, blocks of CSS information inside the HTML document itself 
    o inline styles, inside the HTML document, style information on a single 
    element, specified using the "style" attribute. 
• User style 
    o a local CSS-file specified by the user using options in the web browser, and 
    acting as an override, to be applied to all documents. 
• User agent style 
    o the default style sheet applied by the user agent, e.g. the browser's default 
    presentation of elements. 
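
On the author side, importing further style sheets and tailoring styles to the output device can be sketched in CSS itself. The file names base.css and print.css are hypothetical, and the font choices are arbitrary.

```css
/* Author style sheet. @import rules must come before any other rules. */
@import url("base.css");          /* pull in another author style sheet */
@import url("print.css") print;   /* imported only for printed output */

/* Styles applied only when the document is shown on a screen. */
@media screen {
    body { font-family: sans-serif; }
}

/* Styles applied only when the document is printed. */
@media print {
    body { font-family: serif; }
}
```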

One of the goals of CSS is also to allow users a greater degree of control over presentation; 
those who find, for example, red italic headings difficult to read may apply other style sheets to the 
document. Depending on their browser and the web site, a user may choose from various 
stylesheets provided by the designers, may remove all added style and view the site using their 
browser's default styling or may perhaps override just the red italic heading style without altering 
other attributes. 

File highlightheaders.css containing: 


    h1 { color: white; background: orange !important; } 
    h2 { color: white; background: green !important; } 


Such a file is stored locally and is applicable if that has been specified in the browser options. 
"!important" means that the declaration prevails over the author specifications. 


History 


Style sheets have existed in one form or another since the beginnings of SGML in the 1970s. 
Cascading Style Sheets were developed as a means for creating a consistent approach to 
providing style information for web documents. 

As HTML grew, it came to encompass a wider variety of stylistic capabilities to meet the 
demands of web developers. This evolution gave the designer more control over site appearance 
but at the cost of HTML becoming more complex to write and maintain. Variations in web 
browser implementations made consistent site appearance difficult, and users had less control 
over how web content was displayed. 

To improve the capabilities of web presentation, nine different style sheet languages were 
proposed to the W3C's www-style mailing list. Of the nine proposals, two were chosen as the 
foundation for what became CSS: Cascading HTML Style Sheets (CHSS) and Stream-based 
Style Sheet Proposal (SSP). Firstly, Håkon Wium Lie (now the CTO of Opera Software) 
proposed Cascading HTML Style Sheets (CHSS) in October 1994, a language which has some 
resemblance to today's CSS. Bert Bos was working on a browser called Argo which used its own 
style sheet language, Stream-based Style Sheet Proposal (SSP). Lie and Bos worked together to 
develop the CSS standard (the 'H' was removed from the name because these style sheets could 
be applied to other markup languages besides HTML). 

Unlike existing style languages like DSSSL and FOSI, CSS allowed a document's style to be 
influenced by multiple style sheets. One style sheet could inherit or "cascade" from another, 
permitting a mixture of stylistic preferences controlled equally by the site designer and user. 

Håkon's proposal was presented at the "Mosaic and the Web" conference in Chicago, Illinois 
in 1994, and again with Bert Bos in 1995. Around this time, the World Wide Web Consortium 
was being established; the W3C took an interest in the development of CSS, and it organized a 
workshop toward that end chaired by Steven Pemberton. This resulted in W3C adding work on 
CSS to the deliverables of the HTML editorial review board (ERB). Håkon and Bert were the 
primary technical staff on this aspect of the project, with additional members, including Thomas 
Reardon of Microsoft, participating as well. By the end of 1996, CSS was ready to become 
official, and the CSS level 1 Recommendation was published in December. 

Development of HTML, CSS, and the DOM had all been taking place in one group, the 
HTML Editorial Review Board (ERB). Early in 1997, the ERB was split into three working 
groups: HTML Working group, chaired by Dan Connolly of W3C, DOM Working group, 
chaired by Lauren Wood of SoftQuad, and CSS Working group, chaired by Chris Lilley of W3C. 

The CSS Working Group began tackling issues that had not been addressed with CSS level 1, 
resulting in the creation of CSS level 2 on November 4, 1997. It was published as a W3C 
Recommendation on May 12, 1998. CSS level 3, which was started in 1998, is still under 
development as of 2006. 

In 2005 the CSS Working Group decided to enforce the requirements for standards more 
strictly. This meant that already published standards like CSS 2.1, CSS 3 Selectors and CSS 3 
Text were pulled back from Candidate Recommendation to Working Draft level. 


Difficulty with adoption 


Although the CSS1 specification was completed in 1996 and Microsoft's Internet Explorer 3 
was released in that year featuring some limited support for CSS, it would be more than three 
years before any web browser achieved near-full implementation of the specification. Internet 
Explorer 5.0 for the Macintosh, shipped in March of 2000, was the first browser to have full 
(better than 99 percent) CSS1 support, surpassing Opera, which had been the leader since its 
introduction of CSS support fifteen months earlier. Other browsers followed soon afterwards, 
and many of them additionally implemented parts of CSS2. As of July 2006, no browser has 
fully implemented CSS2, with implementation levels varying (see Comparison of layout engines 
(CSS)). 

Even though early browsers such as Internet Explorer 3 and 4, and Netscape 4.x had support 
for CSS, it was typically incomplete and afflicted with serious bugs. This was a serious obstacle 
for the adoption of CSS. 

When later 'version 5' browsers began to offer a fairly full implementation of CSS, they were 
still incorrect in certain areas and were fraught with inconsistencies, bugs and other quirks. The 
proliferation of such CSS-related inconsistencies and even the variation in feature support has 
made it difficult for designers to achieve a consistent appearance across platforms. Some authors 
commonly resort to using CSS hacks, workarounds, and CSS filters in order to obtain consistent 
results across web browsers and platforms. 

Problems with browsers’ patchy adoption of CSS along with errata in the original 
specification led the W3C to revise the CSS2 standard into CSS2.1, which may be regarded as 
something nearer to a working snapshot of current CSS support in HTML browsers. Some CSS2 
properties which no browser had successfully implemented were dropped, and in a few cases, 
defined behaviours were changed to bring the standard into line with the predominant existing 
implementations. CSS2.1 became a Candidate Recommendation on February 25, 2004, but was 
pulled back to Working Draft status on June 13, 2005, and only returned to Candidate 
Recommendation status on July 19, 2007. 

As of 2006 some older web servers are still configured to serve documents with the filename 
extension .css as MIME type application/x-pointplus. This is because the Net-Scene company was 
selling PointPlus Maker to convert PowerPoint files into Compact Slide Show files (using the 
.css extension) and web servers were configured to signal to client browsers that these .css files 
were x-pointplus media type. Since the plugin was listed in the directory for Netscape Navigator 
3.0, the popular Netscape Enterprise Server was distributed with this mapping pre-configured. 
When reading external style sheets some web browsers try to compensate for the misconfigured 
web servers by treating the PointPlus media type as a text/css media type instead, but some 
(notably Mozilla Firefox) comply with the media type and will not render the external CSS file 
as a Style sheet. 


Variations 


CSS has various levels and profiles. Each level of CSS builds upon the last, typically adding 
new features; the levels are denoted CSS1, CSS2, and CSS3. Profiles are typically a subset 
of one or more levels of CSS built for a particular device or user interface. Currently there are 
profiles for mobile devices, printers, and television sets. Profiles should not be confused with 
media types which were added in CSS2. 

The first CSS specification to become an official W3C Recommendation is CSS level 1, 
published in December 1996. Among its capabilities are support for: 

• Font properties such as typeface and emphasis 
• Color of text, backgrounds, and other elements 
• Text attributes such as spacing between words, letters, and lines of text 
• Alignment of text, images, tables and other elements 
• Margin, border, padding, and positioning for most elements 
• Unique identification and generic classification of groups of attributes 

The W3C maintains the CSS1 Recommendation. 

CSS level 2 was developed by the W3C and published as a Recommendation in May 1998. A 
superset of CSS1, CSS2 includes a number of new capabilities like absolute, relative, and fixed 
positioning of elements, the concept of media types, support for aural style sheets and 
bidirectional text, and new font properties such as shadows. The W3C maintains the CSS2 
Recommendation. 

CSS level 2 revision 1 or CSS 2.1 fixes errors in CSS2, removes poorly-supported features 
and adds already-implemented browser extensions to the specification. While it was a Candidate 
Recommendation for several months, on 15 June 2005 it was reverted to a working draft for 
further review.[6] It was returned to Candidate Recommendation status on 19 July 2007. 

CSS level 3 is currently under development. The W3C maintains a CSS3 progress report. 
As with the evolving XHTML specification, CSS3 is modularized and will consist of several 
separate Recommendations. An Introduction to CSS3 roadmap will be the starting point. 


Browser support 


A 'CSS filter' [8] is a coding technique that aims to hide or show parts of the CSS 
to different browsers, either by exploiting CSS-handling quirks or bugs in the browser, or by 
taking advantage of lack of support for parts of the CSS specifications. Using CSS filters, some 
designers have gone as far as delivering entirely different CSS to certain browsers in order to 
ensure that designs are rendered as expected. Because very early web browsers were either 
completely incapable of handling CSS or rendered it very poorly, designers today 
routinely use CSS filters that completely prevent these browsers from accessing any of the CSS. 

An example of a well-known CSS browser bug is the Internet Explorer box model bug, 
where box widths are interpreted incorrectly in several versions of the browser, resulting in 
blocks which are too narrow when viewed in Internet Explorer, but correct in 
standards-compliant browsers. The bug can be avoided in Internet Explorer 6 by using the correct doctype 
in (X)HTML documents. CSS hacks and filters are used to compensate for bugs such as this, just 
one of hundreds of CSS bugs that have been documented in various versions of Netscape, 
Mozilla Firefox, Opera, and Internet Explorer (including Internet Explorer 7) [1]. 
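
The arithmetic behind the box model bug can be sketched with a single rule. The selector name 
and the pixel values below are assumptions chosen only to make the sums easy to follow; they 
are not taken from any particular site. 

```css
/* Hypothetical rule; the class name and values are for illustration only. */
.sidebar {
  width: 200px;            /* per the CSS specification: width of the content area only */
  padding: 20px;           /* added on every side */
  border: 5px solid black; /* added on every side */
}

/*
 * Standards-compliant rendering:
 *   total box width = 200 + (2 * 20) + (2 * 5) = 250px
 *
 * Internet Explorer box model bug (width taken to include padding and border):
 *   total box width = 200px
 *   content area    = 200 - (2 * 20) - (2 * 5) = 150px
 */
```

The 50-pixel difference is why the same block appears narrower in the affected versions of 
Internet Explorer than in standards-compliant browsers. 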

Even when the availability of CSS-capable browsers made CSS a viable technology, the 
adoption of CSS was still held back by designers’ struggles with browsers’ incorrect CSS 
implementation and patchy CSS support. Even today, these problems continue to make the 
business of CSS design more complex and costly than it should be, and cross-browser testing 
remains a necessity. Other reasons for continuing non-adoption of CSS are: its perceived 
complexity, authors' lack of familiarity with CSS syntax and required techniques, poor support 
from authoring tools, the risks posed by inconsistency between browsers and the increased costs 
of testing. 

Currently there is strong competition between Mozilla's Gecko layout engine, the WebKit 
layout engine used in Apple's Safari, Opera's Presto layout engine, and the KHTML engine used 
in KDE's Konqueror browser; each of them leads in different aspects of CSS. As of 2007, 
Internet Explorer remains the worst at rendering CSS as judged by World Wide Web Consortium 
standards ([2] as linked from [3]). 


Limitations 


Some noted disadvantages of using "pure" CSS include: 

• Different browsers will render CSS layout differently as a result of browser bugs 
or lack of support for CSS features. For example, Microsoft Internet Explorer, 
whose older versions lacked many CSS 2.1 properties, misinterpreted a 
significant number of important properties, such as "width", "height", and "float". 
Numerous so-called CSS "hacks" must be implemented to achieve consistent 
layout among the most popular or commonly used user agents (browsers). 
Pixel-precise layouts can sometimes be impossible to achieve cross-browser. 

• For larger sites, style sheets can grow to become extremely long and complex, 
making editing and overall site management somewhat more difficult and 
tedious than if a basic table layout were used. 

• Although the CSS standards have been in place for years, websites using CSS 
layout have been slow to catch on with many webmasters who have not found 
the need (or desire) to update their sites with the latest standards. 

Selectors are unable to ascend 
CSS offers no way to select a parent or ancestor of an element that satisfies certain criteria. 
A more advanced selector scheme (such as XPath) would enable more sophisticated 
stylesheets. However, the major reasons for the CSS Working Group rejecting proposals 
for parent selectors are related to browser performance and incremental rendering 
issues. 

One block declaration cannot explicitly inherit from another 
Inheritance of styles is performed by the browser based on the containment hierarchy 
of DOM elements and the specificity of the rule selectors, as suggested by the section 
6.4.1 of the CSS2 specification [4]. Only the user of the blocks can refer to them by 
including class names into the class attribute of a DOM element. 

Vertical control limitations 
While horizontal placement of elements is generally easy to control, vertical placement 
is frequently unintuitive, convoluted, or impossible. Simple tasks, such as centering an 
element vertically or getting a footer to be placed no higher than the bottom of the viewport, 
either require complicated and unintuitive style rules, or simple but widely 
unsupported rules. 

Absence of expressions 
There is currently no ability to specify property values as simple expressions (such as 
margin-left: 10% - 3em + 4px;). However, work on a calc() value to address this limitation 
has been discussed by the CSS WG, and Internet Explorer 5 and all later versions 
support a proprietary expression() statement [5], with similar functionality. 

Lack of orthogonality 
Multiple properties often end up doing the same job. For instance, position, display and 
float specify the placement model, and most of the time they cannot be combined 
meaningfully. A display: table-cell element cannot be floated or given position: relative, 
and an element with float: left should not react to changes of display. 

Margin collapsing 
Margin collapsing is, while well-documented and useful, also complicated and is 
frequently not expected by authors, and no simple side-effect-free way is available to 
control it. 
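
A minimal sketch of the most common case, two sibling blocks stacked in normal flow, may help. 
The class names and values are hypothetical. 

```css
/* Hypothetical class names, for illustration only. */
.first  { margin-bottom: 30px; }
.second { margin-top: 20px; }

/*
 * If an element with class "second" immediately follows a sibling block
 * with class "first" in normal flow, the adjoining vertical margins collapse:
 *
 *   gap between the two boxes = max(30px, 20px) = 30px, not 30 + 20 = 50px
 */
```

Authors who expect the two margins to add up are often surprised by the smaller gap, which is 
the behaviour this limitation refers to. 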

Float containment 
CSS does not explicitly offer any property that would force an element to contain floats. 
Multiple properties offer this functionality as a side effect, but none of them are 
completely appropriate in all situations. Setting overflow to a value other than visible is a 
common workaround, but it can clip content or introduce scrollbars, so floats should be used 
and tested carefully. 
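
One widely used workaround relies on the overflow property, which makes an element enclose its 
floated children as a side effect in browsers that follow CSS 2.1. The sketch below uses 
hypothetical class names and is only one of several possible approaches. 

```css
/* Hypothetical class names, for illustration only. */
.container .column {
  float: left;
  width: 50%;
}

/*
 * Without this rule, a container holding only floated columns collapses
 * to zero height. Any overflow value other than "visible" makes the
 * container grow to enclose its floats.
 */
.container {
  overflow: auto; /* side effect: scrollbars may appear if content overflows */
}
```

The comment on the last declaration shows the trade-off: the containment comes bundled with 
overflow handling that the author may not want. 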

Lack of multiple backgrounds per element 
Highly graphical designs require several background images for every element, and CSS 
can support only one. Therefore, developers have to choose between adding redundant 
wrappers around document elements, or dropping the visual effect. This is partially 
addressed in the working draft of the CSS3 backgrounds module [6], which is already 
supported in Safari and Konqueror. 

Control of XHTML Element Shapes 
CSS currently offers only box shapes, that is, rectangles with 90-degree angles. 
Anyone attempting rounded corners or other shapes must resort to non-semantic 
XHTML markup. 

Standard Ordering of Declarations and Property Declarations 
Current CSS accepts property declarations in any order, though usually the last 
occurrence of a property declaration for one selector will take precedence. Also, the 
ordering of property declarations such as size, location, url, which side of the element, 
etc. can lead to confusion. 

Lack of Variables 
CSS contains no variables. Variables would allow naming colors or entire sets of 
declarations thus enabling re-use and reducing file size. Currently, you must make a 
comma separated list of selectors to apply the same declarations to multiple selectors. It 
could be easier to define a list of selectors and define a list of declarations. Variables 
could also make colors listed in declarations more human-readable than hexadecimal or 
RGB values, thus speeding CSS development time. 
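
The repetition described here can be sketched as follows. The selectors and the color value 
are hypothetical. 

```css
/* A comma-separated selector list applies one block of declarations
   to several selectors (hypothetical names, for illustration only). */
h1,
h2,
.sidebar-title {
  color: #336699;
  font-family: Verdana, sans-serif;
}

/* Elsewhere in the sheet the same color has to be written out again,
   and every copy must be updated by hand if the color ever changes. */
a:link {
  color: #336699;
}
```
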

Weaknesses in Pseudo-Classes and Pseudo-Elements 
These have vague spheres of existence and perhaps could be elevated to classes or 
demoted to properties. Sometimes the order of the appearance in the style sheet 
determines functionality implicitly. 

Lack of column declaration 
While possible in current CSS, layouts with multiple columns can be complex to 
implement. With the current CSS, the process is often done using floating elements 
which are often rendered differently by different browsers, different computer screen 
shapes, and different screen ratios set on standard monitors. A column declaration, if 
added to CSS, would fix these issues. 


See also 


• Comparison of layout engines (CSS) 
• Tableless web design 


Tableless web design 


Tableless Web design is a method of web design and development without using HTML 
tables for page layout control purposes. Instead of HTML tables, style sheet languages such as 
CSS (Cascading Style Sheets) are used to arrange elements and text on a web page. CSS was 
introduced by the World Wide Web Consortium (W3C) to improve web accessibility and to 
make HTML code semantic rather than presentational. 

Early on, many web developers considered CSS a more powerful and easier-to-use 
way of formatting, and considered the <FONT> tag obsolete. On the other hand, early CSS had 
very limited and difficult-to-use layout capabilities, and many common page layouts (like the 
3-column design) that were very easy to implement with tables had no obvious equivalents in CSS. 

As of 2006, the situation had improved considerably. However, many popular browsers have 
limited or buggy support for these newer features of CSS, which has slowed the adoption of 
tableless web design. Many websites still use CSS for text formatting only, while using tables for 
layout. 


Rationale 


HTML was originally designed as a semantic markup language intended for sharing 
scientific documents and research papers online. However, as the Internet expanded from the 
academic and research world into the mainstream in the mid-1990s, and became more 
media-oriented, graphic designers sought ways to control the visual appearance of the Web pages 
presented to end users. To this end, tables and spacers (usually transparent single-pixel GIF 
images with explicitly specified width and height) have been used to create and maintain page 
layout. 

This causes a number of problems. Many Web pages have been designed with tables nested 
within tables, resulting in large HTML documents which use more bandwidth than documents 
with simpler formatting. Furthermore, when a table-based layout is linearized, for example when 
being parsed by a screen reader or a search engine, the resulting order of the content can be 
somewhat jumbled and confusing. 

As a consequence of this trend, CSS was developed to improve the separation between 
design and content, and move back towards a semantic organization of content on the Web. 

In addition, a web browser usually has to download all content within a table before 
displaying it on a page, resulting in slower-seeming load times. Without tables, content on a page 
can load sequentially, appearing faster to the end user. 


Advantages 
Accessibility 


Main article: Web accessibility 

Because of the rapid growth of the Internet, disability discrimination legislation, and the 
increasing use of mobile phones and PDAs, it is necessary for Web content to be made 
accessible to users operating a wide variety of devices. Tableless Web design considerably 
improves Web accessibility in this respect. Screen readers and braille devices have fewer 
problems with tableless designs because they follow a logical structure. 

As a result of the separation of design (CSS) and structure (HTML), it is also possible to 
provide different layouts for different devices, e.g. handhelds, mobile phones, etc. It is also 
possible to specify a different style sheet for print, e.g. to hide or modify the appearance of 
advertisements or navigation elements that are irrelevant and a nuisance in the printable version 
of the page. 

The W3C's Web Content Accessibility Guidelines’ guideline no. 3 states "use markup and 
style sheets and do so properly." The guideline's checkpoint 3.3, a priority-2 checkpoint, says 
"use style sheets to control layout and presentation." 


Bandwidth savings 


In general, a tableless design results in smaller, more compact Web pages. Tableless designs 
do away with the need for embedded tables and spacer images and can significantly reduce 
download times, save bandwidth and reduce the latency of the site. 

The CSS file can also be cached by the browser, thereby providing further savings in 
bandwidth, particularly for dynamically generated content as the layout information does not 
need to be reloaded with every page visit. 

These issues are particularly important on busy and popular sites, and on content that will be 
viewed by users with slow Internet connections (e.g. on mobile devices). 


Maintainability 


Maintaining a website may require frequent changes, both small and large, to the visual style 
of a website, depending on the purpose of the site. Under table-based layout, the layout is part of 
the HTML itself. As such, without the aid of template-based visual editors such as HTML 
editors, changing the positional layout of elements on a global scale may require a great deal of 
effort, depending on the amount of repetitive changes required. Even employing grep or similar 
global find & replace utilities cannot alleviate the problem entirely. 

In tableless layout using CSS, virtually all of the layout information resides in one place: the 
CSS document. Because the layout information is centralized, these changes can be made 
quickly and globally by default. The HTML files themselves do not, usually, need to be adjusted 
when making layout changes. If they do, it is usually to add class attributes to specific markup 
elements or to change the grouping of various sections with respect to one another. 

Also, because the layout information is stored externally to the HTML, it is quite easy to add 
new content in a tableless design, whether modifying an existing page or adding a new page. By 
contrast, without such a design, the layout for each page may require a more time-consuming 
manual changing of each instance or global find & replace utilities. 

Lastly, if the HTML source code is to be edited by hand, the code is more visually readable 
without the table layout tags and other styling information. 


Problems 


The main problem with tableless design is the wide variation in browser support. 
Implementing a CSS layout for multiple browsers is complicated by bugs and by 
misinterpretation of the standards by different browser developers. 
These necessitate a large number of complex hacks and workarounds in the CSS files, and Web 
pages need to be tested much more carefully on a wider variety of devices than with table-based 
design, as some of these bugs can render the content illegible on some browsers. 

In addition, CSS support in some older browsers such as Netscape 4 is very incomplete, 
which can cause major problems if these browsers also need to be targeted. 

Conversion to tableless web design has also been slowed by the limitations of table-to-layer 
(CSS) conversion software. HTML editors such as Adobe Dreamweaver can convert tables to layers 
and back. Though this eases the conversion a little, the conversion is not exact: a centered 
table is centered on the page, but centering a group of layers together across different screen 
resolutions requires some tinkering. 

Stigmatizing the use of tables creates a situation where people fail to use tables when tables 
are appropriate. Using divisions to simulate a table for the display of tabular data is as much a 
design flaw as using tables to simulate a division. Some consider the term "tableless design" 
antiquated, as the goal is to use the appropriate design tool for a task. 


Comparison of layout engines (CSS) 


The following tables compare CSS compatibility and support for a number of layout engines. 
Please see the individual products’ articles for further information. This article is not all-inclusive 
or necessarily up-to-date. Unless otherwise specified in footnotes, comparisons are based on the 
stable versions without any add-ons, extensions or external programs. 


Engine Nomenclature 


• Trident is the layout engine used in Internet Explorer for Windows. 
• Tasman is used in Internet Explorer for Macintosh. 
• Gecko in all Mozilla software. 
• WebCore provides the HTML layout engine for Mac OS X. 
• KHTML for KDE. 
• Presto for Opera. 
• iCab is a browser for the Mac. 


Meaning of Values 


"Yes" indicates that the current version of the browser fully supports this property when valid 
values are used. "No" indicates that this property is completely ignored. "Partial" indicates that 
the property is understood, but that not all values are supported. Supported values are 
implemented correctly. "Buggy" indicates that the property is understood, but that it is not 
implemented correctly in all cases. "Dangerous" indicates that the property is understood, but 
that deleterious behaviour results. Version numbers, where known, are provided to indicate when 
support for this feature was added. If support was added in a publicly available unofficial release 
(such as an alpha or beta version), a red background is used. Proprietary properties are not 
included. 


General overview 


See the article Cascading Style Sheets for more information on the different versions of CSS. 
This table doesn't reference CSS2.0, as CSS2.1 was intended to replace it by correcting or 
removing a few errors and features rejected by the CSS community. Most of the removed CSS2 
features will be made obsolete by CSS3. 

        Trident  Tasman   Gecko   WebCore  KHTML   Presto  iCab 
CSS1    6.0      0        10      8685     Yes     7.0     Yes 
CSS2.1  Partial  Partial  Mostly  Mostly   Mostly  9.0     Mostly 
CSS3    No       No       Slight  Slight   Slight  Slight  Slight 


Grammar and rules 


[Table garbled in the source; the individual values are not recoverable. It listed support in 
Trident, Tasman, Gecko, WebCore, KHTML, Presto and iCab for: !important, comments, @import, 
@charset, @media, @page, @namespace and @font-face.] 


Trident grammar and rule notes 


1. @font-face – Only supports Embedded OpenType (.eot) font format, does not 
understand the format() identifier. 
2. !important – !important doesn't override rules defined later in the same 
declaration block prior to IE7. 


Tasman grammar and rule notes 


1. @font-face – Versions 5.16 and 5.17 will download fonts specified but not use 
them. 5.23 no longer downloads the font. 


Presto grammar and rule notes 


1. @import – Whilst Gecko, WebCore and iCab download all media stylesheets 
immediately, Opera only downloads handheld, print, projection and screen 
media, as well as speech if "voice" feature is enabled and tv, on tv devices. Text 
browser emulation mode is only a user stylesheet, so it does not switch to tty 
media type. This is consistent with older text browsers, which do not respect 
any CSS. 


Selectors 


Trident selector notes 


1. :active – :active is for anchor element only. 
2. :hover – :hover is applied for all elements only in standards-compliant mode, 
not quirks mode [1]. Prior to 7.0, :hover is for anchor element only. 
3. .one.two – only .two class selector was taken into consideration before 7.0. 
4. * – Prior to 7.0 this was treated as a single or no element. 


Tasman selector notes 


1. :hover – For anchor element only prior to 0.9. 
2. .one.two – only .two class selector is taken into consideration. 


WebCore selector notes 


1. :lang() — Only detected when explicitly present on element being tested, 
attribute not inherited. 

2. :target — Style doesn't get applied when navigating using back and forward 
buttons. 

3. :last-child & :last-of-type — will match all siblings, or all siblings of the same 
type, respectively. 

4. :only-child & :only-of-type — identical behaviour to :first-child & :first-of-type. 

5. :pseudo-element — Mistakenly also triggers on ::pseudo-class prior to version 
416.0. 

Properties 


Trident property notes 


1. overflow – overflow:visible is incorrectly supported prior to IE7. 
2. border-style – dotted is rendered as dashed prior to IE7. 
3. display – Only none, block, inline, table-header-group, and table-footer-group are 
supported prior to 7.0. In 5.5+, inline-block is supported on elements that are 
naturally inline. 
4. position – Prior to IE 7, fixed positioning was not supported; IE 7 supports it in 
standards-compliant mode only. 
5. background-attachment – Prior to IE 7, fixed was allowed on the body element 
only. 
6. background-position – Fixed positioning is not supported. 
7. font-weight – Incorrect rendering when value is 600 [2]. 
8. text-decoration – Optional value blink is not supported. 
9. white-space – nowrap is supported in IE 5.5 and later. pre is supported in IE 6 
and later in standards-compliant mode. pre-line and pre-wrap are not supported. 
10. list-style-type – decimal-leading-zero, lower-greek, lower-latin, upper-latin, 
upper-roman are not supported. 


Gecko property notes 


1. display – run-in [3] is not supported. inline-table [4] and inline-block [5] are 
supported in Gecko 1.9a2+. 
2. quotes – Does not support nested quotes prior to 1.8 [6]. 
3. content – Only works with :before and :after. Only supports image <uri>. 
<counter> does not work prior to 1.8 [7]. 
4. background-position – Gecko versions prior to 1.7 implement the CSS2 syntax, 
not the proposed CSS2.1 expanded syntax. 
5. font-size-adjust – Supported in Windows only [8]. This problem will be fixed in 
the future Gecko 1.9. 
6. font-weight – Only Regular and Bold weights get used, even if Light or 
Heavy/Black faces are installed. 
7. white-space – pre-line is not supported. pre-wrap is only supported 
experimentally as -moz-pre-wrap. 
8. visibility – collapse is not supported prior to 1.8. 
9. border-radius – border curves are circular, but the W3C specs [9] define them 
as elliptical. Dashed curves are not available. Short-cut definitions for 
border-radius read "tl tr br bl" instead of the W3C's "tr br bl tl". Interesting bugs: [10] 
[11] [12] 
10. page-break-before; page-break-after – Only the always and auto values are 
supported. 


WebCore property notes 


1. box-shadow – The system font keywords are not supported that allow 
designers to tailor presentation to the user’s operating system environment. 
2. font – The system font keywords are not supported that allow designers to 
tailor presentation to the user’s operating system environment. 
3. font-weight – Only Regular and Bold weights get used, even if Light or 
Heavy/Black faces are installed. 
4. page-break-before; page-break-after – Only the always and auto values are 
supported. 
5. text-decoration – Optional value blink is not supported. 
6. white-space – pre-line and pre-wrap from the proposed CSS2.1 draft are not 
supported. 
7. text-shadow – Multiple shadows are not supported. 
8. visibility – collapse is not supported prior to 522. 


KHTML property notes 


1. overflow – Values scroll and auto are unsupported. 
2. page-break-before; page-break-after – Before 3.5 only the always and auto 
values were supported. 
3. text-decoration – Optional value blink is not supported. 
4. visibility – All supported except for collapse. 


Presto property notes 


1. list-style-type – The CSS2 values are not supported prior to 8.0. 
2. counter-increment, counter-reset – Implemented the algorithm in REC CSS2. 
3. background-position – Presto versions prior to 8.0 implement the CSS2 syntax, 
not the proposed CSS2.1 expanded syntax. 
4. font-weight – Incorrect rendering when value is 600 [13]. 
5. white-space – pre-line is not supported. 
6. visibility – All supported except for collapse. 


iCab property notes 


1. display – run-in is not supported. 
2. text-align – justify is not supported. 
3. font-stretch – Simulated using letter-spacing property. 
4. font-weight – Incorrect (bold) rendering when value is 500. 
5. visibility – collapse only partial. 
6. caption-side – top and bottom only, which would comply with the proposed 
CSS2.1 elimination of the values left and right. 
7. empty-cells – Compresses hidden rows but does not hide them completely. 


Values and units 


Trident value and unit notes 


1. auto – In IE5 bug emulation mode (quirks mode) does not work for margins, 
except table elements. 


WebCore value and unit notes 


1. <percentage> – Only uses the integer component of <number>. 


Presto value and unit notes 


1. attr() – As all URI attribute values are resolved to full URI, the value could be 
wrong if attr() is used on an attribute that contains relative URI. 
2. <number> – Quantization error for values greater than 20.47 [15] (not limited 
to em, try any non pixel units). 


KHTML value and unit notes 


1. rgba() – Because opacity is not supported, the opacity value of rgba() is ignored. 


General notes 


Gecko general notes 


1. -moz- – All experimental selectors, properties and values are prefixed with 
"-moz-", e.g. ::-moz-selection instead of ::selection [16]. 


WebCore general notes 


1. -webkit- – All experimental selectors, properties and values are prefixed with 
"-webkit-", e.g. -webkit-box-shadow instead of box-shadow. 


KHTML general notes 


1. -khtml- – All experimental selectors, properties and values are prefixed with 
"-khtml-", e.g. -khtml-opacity instead of opacity. 
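
A short sketch of how an author might combine the prefixed names mentioned in the notes above 
with their unprefixed equivalents. The class name and the values are hypothetical; only the 
prefixed property and selector names come from the notes. 

```css
/* Hypothetical class name and values, for illustration only. */
.panel {
  -webkit-box-shadow: 2px 2px 4px #666666; /* experimental WebCore name */
  box-shadow: 2px 2px 4px #666666;         /* unprefixed name, listed last */
  -khtml-opacity: 0.8;                     /* experimental KHTML name */
  opacity: 0.8;
}

/*
 * Prefixed selectors are kept in separate rules: a selector list that
 * contains a selector the browser does not recognise causes the whole
 * rule to be ignored.
 */
::-moz-selection { background: #ffff99; }
::selection      { background: #ffff99; }
```

Browsers skip declarations they do not understand, so each engine picks up only the lines 
meant for it. 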


Presto general notes 


1. -xv- – All new selectors, properties and values introduced by CSS3 Speech 
Module are prefixed with "-xv-" (but not found in CSS2 aural style sheets), e.g. 
-xv-voice-rate instead of voice-rate. 


Printer friendly 


Printer friendly is a term used on the Internet to describe a version of a web page formatted 
for printing. Normally, web pages include information and navigation content that is only useful 
while browsing the website. Printer friendly pages are designed to fit on an 8.5"x11" or A4 sheet 
(see paper sizes) and include only the content of the page, along with source information. 
Printer friendly pages are usually accessed through a link on the standard page. 
Wikipedia has a link to printer friendly versions of articles in the toolbox menu to the left. 


Designing printer-friendly pages with CSS 


Web page designers can use CSS2 media types to create one set of styles for display on a 
computer screen, and another for printing. For example, navigation areas might be set as 
display:none in the print stylesheet to remove them from the printed output, and sans-serif fonts 
replaced by serif fonts for ease of reading. This technique eliminates the need for a separate 
printer-friendly page to be maintained. 


<link rel="stylesheet" type="text/css" href="screen.css" media="screen" /> 
<link rel="stylesheet" type="text/css" href="printing.css" media="print" /> 
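
The print style sheet itself might look something like the sketch below. The selectors 
(#navigation, .advertisement) and the font choices are assumptions made for illustration; a 
real page would use whatever names its own markup defines. 

```css
/* Possible contents of printing.css; all selectors are hypothetical. */

/* Remove elements that are only useful while browsing. */
#navigation,
.advertisement {
  display: none;
}

/* Replace the screen's sans-serif fonts with a serif face for paper. */
body {
  font-family: Georgia, "Times New Roman", serif;
  color: black;
  background: white;
}
```

The same rules could instead be wrapped in an @media print block inside a single style sheet, 
which avoids the second link element at the cost of a larger file for screen users. 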


Brochureware 


A brochureware website is a business website that has very infrequently updated content, 
and little of it. Often the site has been developed as a direct translation of existing printed 
promotional materials, hence the name. 

Brochureware sites therefore take little advantage of the capabilities of the web that are 
missing in printed publication. Often the only hyperlinks on the site are in the site's navigation 
menu. 

Brochureware sites can be produced using WYSIWYG web authoring software, and 
consequently are often authored by people with little or no technical knowledge. Major elements 
of pages in a brochureware site are often delivered as inline images, even when these elements 
can be rendered perfectly well as text. This is very poor accessibility practice, as well as having a 
negative impact on the site's search engine rankings. 

Other common traits of brochureware sites include fixed width layouts, stock photographs of 
smiling models, and excessive use of information-free content delivered by web browser plugins 
such as Adobe Flash. 


Digital strategy 


In the fields of strategic management, marketing strategy, and operational strategy, digital 
strategy is the process of specifying an organization's vision, initiatives and processes in order to 
deploy their online assets (as of 2007, these include: web sites, mini-sites, mobile sites, digital 
audio and video content, rich Internet applications, community groups, banner ads, search engine 
marketing, affiliate programs, etc.) in a manner which maximizes the business benefits they 
provide to the organization. 


Overview 


There are numerous approaches to conducting digital strategy, but at their core, all go 
through three stages: identifying the key opportunities and/or challenges in a business where 
online assets can provide a solution; identifying the unmet needs and goals of the customers that 
most closely align with those key business opportunities and/or challenges; and developing a 
vision around how the online assets will fulfill those business and customer needs, goals, 
opportunities and challenges, and prioritizing a set of online initiatives which can deliver on 
this vision. Within each of those stages, a number of techniques and analyses may be employed. 


Identifying the key opportunities and/or challenges in a business 


• Stakeholder interviews 
Includes one-on-one interviews, group interviews and workshops with a company's senior 
management, marketing and sales, operations and service stakeholders with a goal of 
understanding the business strategy, challenges and opportunities, products, organization, 
processes, supply chain and vendors, distributors, customers, and competitive landscape, as well 
as the potential role of their online assets. 
• Competitive analysis 
Includes evaluations of a company's main competitors and potential substitutes with the goal 
of understanding a company's strengths and weaknesses relative to its competitors and 
potential substitutes. While this often includes elements found in traditional marketing 
competitive analysis, such as products and prices, competitive analysis for digital strategy 
includes two unique items: 
    • Heuristic evaluation 
    An evaluation by a usability expert of the usability and user experience of a company's 
    online assets compared and contrasted to those of its competitors and potential 
    substitutes. 
    • Features/functionality analysis 
    An evaluation of the features and functionality provided by a company's online assets, 
    compared and contrasted to those of its competitors and potential substitutes. 
• Financial analysis 
An analysis of a company's financial data (which may include everything from public 
financial statements to private ERP data) with the goal of understanding the financial impact 
(positive and negative) that certain changes would have on a company. 


Identifying the unmet needs and goals of the customers 


• Customer interviews
Includes one-on-one interviews and focus groups with a company's customers with a goal of
understanding customers' behaviors, needs, goals and perceptions of the company and their
industry, both in the broadest business context and specifically online. In addition to
standard marketing strategy methodologies and questions, customer interviews for digital
strategy may include usability testing, an analysis of how effective customers are at using the
online assets developed by a company for their intended purposes. In digital strategy this is used
to uncover usability barriers in the present state that may prevent the accomplishment of the
online vision.

• Ethnographic research
An analysis of customer behaviors in their environment, for example field observations of
shoppers at a store. In addition to standard ethnographic research, digital strategy may include
videotaping customers using their computers or specific computer applications or web sites.

• Web analytics
An analysis of the usage patterns of a company's online assets with the goal of better
understanding customer behavior as well as identifying strengths and weaknesses of the company's
current online offerings. This may include understanding how many people visit a web
site, which pages and paths are most popular, where visitors come from, where they drop off,
and how long they stay.

• Funnel analysis
A specific methodology for web analytics where the company's online assets are modeled as
a sales funnel, with a visit or impression representing a new lead, a certain page or action in the
web site considered a conversion (such as a user reaching the purchase confirmation page) and
specific pages in the web site representing specific stages of the sales funnel. The goal of the
analysis is to provide insight into the overall conversion rate as well as the key weak points of
the funnel (the stages in which the largest percentages of users drop out of the funnel).

• Analytical CRM
An analysis of a company's customer databases and information repositories with the goal of
segmenting customers into homogeneous groups across one or more dimensions of behavior,
demographics, value, product or marketing message affinity, etc. In digital strategy this often
includes the online customer registration database which companies use to provide access to
the customer-specific, protected areas of their sites.

• Multi-channel analysis
An analysis of customers' behavior (such as their purchase behavior or their service
behavior) that looks across all the different channels in which customers interact with a
company's products or information. There are many ways to do this, but in a
representative example a company focuses on the customer purchase process (how a
customer becomes aware of a product, how a customer develops the intent to purchase a product,
and how a customer actually purchases the product). The analysis would look at which channels
(for example phone, catalog, retail store, web site, or 3rd party search engine) a customer uses at
which stage of the purchase process, attempt to understand why each channel is used, and
evaluate the company's strengths or weaknesses in that particular channel for that particular
stage of the process.

• Statistical surveys
An approach to collecting customer feedback in a quantitative manner from a large
population. In digital strategy, surveys may be used to validate or invalidate key questions raised
in more qualitative exercises such as customer interviews and focus groups. Depending on the
breadth of the survey population and the degree of variation within the population, survey results
may be segmented to form homogeneous groups across one or more dimensions of behavior,
demographics, value, product or marketing message affinity, etc. Surveys are often conducted
online using web intercepts, e-mail lists, or 3rd party panels, although phone surveys or other
offline means may sometimes be used when there are questions about the online savvy of a
particular target population.
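
The funnel analysis described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the stage names, the visit counts and the funnel_report function are all hypothetical, and a real analysis would take its counts from a web analytics tool.

```python
from typing import List, Tuple


def funnel_report(stages: List[Tuple[str, int]]):
    """Return per-step pass-through rates, the overall conversion rate,
    and the step that retains the smallest share of users."""
    steps = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        rate = count / prev_count if prev_count else 0.0
        steps.append((f"{prev_name} -> {name}", rate))
    overall = stages[-1][1] / stages[0][1] if stages[0][1] else 0.0
    weakest = min(steps, key=lambda step: step[1])
    return steps, overall, weakest


# Hypothetical user counts for each stage of the funnel (made-up numbers).
stages = [
    ("visit", 10000),
    ("product page", 4000),
    ("cart", 1000),
    ("purchase confirmation", 300),
]

steps, overall, weakest = funnel_report(stages)
for label, rate in steps:
    print(f"{label}: {rate:.1%} continue")
print(f"overall conversion rate: {overall:.1%}")
print(f"weakest step: {weakest[0]}")
```

With these made-up counts, the step from product page to cart retains the smallest share of users, so it would be the first candidate for closer study.
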


Developing a vision and prioritizing a set of online initiatives


• Business plan or case
A spreadsheet with supporting documentation that quantifies the investments and returns
over time that will result from the execution of the online strategy. The business plan also
defines the Key Performance Indicators (KPIs) that will be used to measure and evaluate the
success of the online strategy.
• Technical assessment
A design of a technical architecture which will meet the needs of the business vision and
conform to the business plan and roadmap. This is often done as a gap analysis: the current
technical architecture is assessed, a future technical architecture which meets the needs of the
online vision is designed, the gaps between the current state and future state are identified, and
a series of initiatives or projects to fill those gaps is developed and sequenced.
• Organizational and process assessment
Similar to a technical assessment, organizational and process assessments look at the changes
that need to be made to an organization and its processes in order to achieve the online vision.
They may involve a series of business process reengineering projects focused on the areas of an
organization most affected by the online initiatives.
• Portfolio management
A way of prioritizing various initiatives by comparing their cost of implementation to their
expected business benefits. This is often done by creating a two-by-two matrix where cost of
implementation runs along the x-axis (from high cost to low cost) and expected business benefit
runs along the y-axis (from low benefit to high benefit). Individual initiatives or projects are then
plotted on the matrix in terms of their calculated costs and benefits, and priorities are determined
according to which projects will provide the greatest benefit for the least cost.
• Online media plan
A plan detailing the allocation of media spending across online media (as of 2007, this
includes search engine marketing, banner advertising, and online affiliate networks), usually as
part of the customer acquisition or retention portions of the digital strategy.
• Proof of concept
Graphical representations or comps of key ideas or processes of the digital strategy. These are
often created in order to better communicate a key concept or to build excitement among
stakeholders when building consensus or socializing a digital strategy.
• Roadmap
A high-level project plan which details the durations and dependencies of all the initiatives in
the digital strategy. The roadmap will often include checkpoints to assess the progress and
success of the digital strategy.
• Measurement plan
A description of the key performance indicators used to measure the effectiveness of the
digital strategy as well as the process for collecting and sharing the information. The
measurement plan usually covers the financial, operational, and e-business metrics and their
relationships.
• Governance model
The organizational structure, roles, and process description of the operational entity that will
manage the initiatives in a digital strategy. The governance model describes who is responsible
for what, how decisions are made, how issues are escalated, and how information on the
performance of the projects is communicated within the organization.
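
The portfolio management step in the list above lends itself to a small worked example. The sketch below is only one way to express the two-by-two matrix: the Initiative class, the cutoff values and all figures are hypothetical, and ranking by benefit per unit of cost is an assumed reading of "the greatest benefit for the least cost".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Initiative:
    name: str
    cost: float      # estimated cost of implementation (must be > 0)
    benefit: float   # expected business benefit, in the same units


def quadrant(item: Initiative, cost_cutoff: float, benefit_cutoff: float) -> str:
    """Place an initiative in one cell of the two-by-two matrix."""
    benefit_label = "high benefit" if item.benefit >= benefit_cutoff else "low benefit"
    cost_label = "low cost" if item.cost <= cost_cutoff else "high cost"
    return f"{benefit_label} / {cost_label}"


def prioritize(items: List[Initiative]) -> List[Initiative]:
    """Order initiatives by benefit per unit of cost, best first."""
    return sorted(items, key=lambda i: i.benefit / i.cost, reverse=True)


# Hypothetical initiatives with made-up figures (thousands of dollars).
portfolio = [
    Initiative("site redesign", cost=400, benefit=600),
    Initiative("site search", cost=50, benefit=200),
    Initiative("3D product viewer", cost=300, benefit=90),
    Initiative("newsletter signup", cost=20, benefit=40),
]

for item in prioritize(portfolio):
    print(f"{item.name:20s} {quadrant(item, cost_cutoff=100, benefit_cutoff=100)}")
```

Initiatives that land in the high benefit / low cost cell would normally be scheduled first; where the cutoffs sit is a judgment call for each organization.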


Role of personas in digital strategy 


As of 2007, a trend in digital strategy is the use of personas as a framework for using
customer information to prioritize online initiatives. A persona is a character sketch which
represents a typical member of one customer segment and highlights that member's needs, goals
and behaviors. Because it is representative of a customer segment, it allows decision makers to
prioritize various features based on the needs of the segment. Because it is a character sketch, it
is sometimes easier for decision makers to internalize the key needs of the segment than it would
be by reading reams of data. A typical approach is to create the segments based on
customer analysis such as customer interviews, ethnographic research, and statistical surveys,
then assemble key decision makers or stakeholders, present the personas, and use
them to kick-start a brainstorming session around different online initiatives which can meet the
personas' needs and goals.


Execution of a digital strategy 


Historically, execution of a business or digital strategy has been done as a big bang, with large
initiatives such as site redesigns and transactional systems taking 6-12 months to develop and
often an additional 6-12 months before they deliver any results. As of 2007, a trend has
emerged where companies adopt a more iterative approach to rolling out their strategies, one
which leverages a series of smaller tests that are carefully measured and analyzed and used to
modify or optimize the digital strategy.

An example of this test-measure-optimize-scale approach
is that a company might take some key pages on its site and test a number of versions of those
pages with different marketing messages, design approaches, user experience optimizations,
navigation optimizations, and even new features and functions using a multivariate or A/B test.
The company would then identify the page which had the best combination of changes in terms
of some key business metric (such as conversion), analyze the results to understand which
changes were most instrumental in producing the high conversion rate, and apply those
learnings to future pages and future tests.

The advantage of this approach is that in the long run
it tends to be more successful in delivering business results, because each step is measured and
adjusted for. In addition, it tends to favor smaller (less risky, less expensive) steps rather than
larger (more risky, more expensive) initiatives before getting the payback.[9] The disadvantage
is that over time this approach tends to converge on a solution (a local optimum) that is not
necessarily the best solution (the global optimum) that might have been reached had the company
started from scratch instead of building each step on the previous one. Another disadvantage is
that although this approach tends to favor smaller, more incremental changes, there is often a
larger up-front cost for setting up all the measurement systems and staffing a company with the
right analysts and change processes to react to these tests in a timely and effective manner. As a
result, companies often adopt a mix of big bang efforts augmented by some smaller, more
iterative efforts as part of their overall strategy.
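
To make the "measure" step of an A/B test concrete, the following sketch compares the conversion rates of two page versions. The text does not prescribe a statistical method; the two-proportion z-test used here is an assumption chosen for illustration, and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion
    rate between version A and version B, using a pooled standard error."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: 10,000 visitors saw each version of the page.
z, p_value = two_proportion_z(conv_a=200, n_a=10000, conv_b=260, n_b=10000)
print(f"A converts at {200 / 10000:.1%}, B at {260 / 10000:.1%}")
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the versions is unlikely to be noise. A multivariate test varies several page elements at once and needs a correspondingly more careful analysis than this two-version comparison.
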


Digital strategy vs. online strategy


As of 2007, these two terms tend to be used somewhat interchangeably. However,
a consensus is beginning to emerge around the differences between digital strategy and
online strategy. Digital strategy is beginning to refer to the strategy a company takes to become a
digital company, where digital connotes deeper interactions with its customers, more
customized and personalized offerings and interactions, data-driven decision making, and
organizational models and processes which are more nimble and reactive to changes in the
company's environment. When digital strategy is the imperative, companies tend to look to
well-regarded digital companies such as Google, Amazon, and eBay as their aspirational models. In
this context, a company may limit the term online strategy to the development of
plans to deploy its online assets to maximize business results, and use digital strategy for the
more transformative step of changing the organization.


DOM scripting 


The term DOM Scripting refers to programmatically accessing the Document Object Model
(DOM). In common usage, DOM Scripting usually implies the use of JavaScript, but it can
equally be done in any other programming or scripting language with access to the
DOM. DOM Scripting has its roots in DHTML, but is far more structured. It is the third pillar in
the web standards movement.
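
Since DOM access is not tied to JavaScript, a short Python sketch can illustrate the idea. It uses the xml.dom.minidom module from the Python standard library, which offers the familiar DOM methods; the menu markup is a made-up example.

```python
from xml.dom.minidom import parseString

# A small, well-formed document to operate on (hypothetical content).
doc = parseString("<ul id='menu'><li>Home</li><li>About</li></ul>")
menu = doc.documentElement

# Create a new element and a text node, then attach them to the tree.
item = doc.createElement("li")
item.appendChild(doc.createTextNode("Contact"))
item.setAttribute("class", "new")
menu.appendChild(item)

# Read the tree back through the same DOM interface.
labels = [li.firstChild.data for li in menu.getElementsByTagName("li")]
print(labels)
print(menu.toxml())
```

The calls used here (createElement, createTextNode, appendChild, setAttribute, getElementsByTagName) have direct counterparts in browser JavaScript, so the same steps carry over to scripting a live web page.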


DOM Scripting vs. DHTML 


Traditional DHTML
• uses JavaScript, CSS and HTML
• tends to be browser-specific
• usually requires code forking
• web only
• was used with non-standardized markup
DOM Scripting
• uses a scripting language and valid markup (HTML/XHTML/XML/etc.)
• can be used with any valid document type
• can be used with any programming/scripting language
• is browser independent
• degrades gracefully


Books 


• Jonathan Snook, Aaron Gustafson, Stuart Langridge and Dan Webb: Pro DOM
Scripting with Ajax, APIs and Libraries, ISBN 1-59059-764-8

• Jeffrey Sambells and Aaron Gustafson: AdvancED DOM Scripting: Dynamic Web
Design Techniques, ISBN 1-59059-856-3

• Christian Heilmann: Beginning JavaScript with DOM Scripting and Ajax,
ISBN 1-59059-680-3

• Stuart Langridge: DHTML Utopia - Modern Web Design Using JavaScript and
DOM, ISBN 0-9579218-9-6

• Jeremy Keith: DOM Scripting - Web Design with JavaScript and the Document
Object Model, ISBN 1-59059-533-5


Fahrner Image Replacement 


Fahrner Image Replacement (abbreviated FIR) is a Web design technique that uses
Cascading Style Sheets to replace text on a Web page with an image containing that text. It is
intended to keep the page accessible to users of screen readers, text-only web browsers, or other
browsers where support for images or style sheets is either disabled or nonexistent, while
allowing the image to differ between styles. FIR is named for Todd Fahrner, one of the people
originally credited with the idea of image replacement.


Motivation 


The typical method of inserting an image in an HTML document is via the <img> tag. This
method, however, has drawbacks with regard to accessibility and flexibility:

• Users of browsers without support for images cannot see the text. Adding alternative
text using the alt attribute disallows HTML markup and causes problems with
some search robots.

• Using the <img> tag to show text is presentational; many Web designers argue
that presentational elements should be separated from HTML content by
placing the former in a CSS style sheet.

• Images referenced using an <img> tag cannot be easily changed via CSS, causing
problems with alternate stylesheets.

Fahrner Image Replacement was devised to rectify these issues. 


Implementations 


The original FIR implementation described by Douglas Bowman used a heading, inside of
which was a <span> element containing the text of the heading:


<h3 id="firHeader"><span>Sample Headline</span></h3> 


Through style sheets, the heading was then given a background containing the desired image, 
and the <span> hidden by setting its display CSS property to none: 


#firHeader { width: 300px; height: 50px; background: #fff url(firHeader.gif) top left no-repeat; }
#firHeader span { display: none; }


It was soon discovered, however, that this method caused some screen readers to skip over
the heading entirely, as they would not read any text that had a display property of none. The
later Phark method, developed by Mike Rundle, instead used the text-indent property to push
the text out of the image's area, addressing this issue:


#firHeader { width: 300px; height: 50px; background: #fff url(firHeader.gif) top left no-repeat; text-indent: -5000px; }


The Phark method had its own problems, however; in visual browsers where CSS was on but 
images off, nothing would display. 

Dave Shea's eponymous Shea method solves both of the issues earlier mentioned, at the cost 
of an extra <span>: 


<h3 id="header"><span></span>Revised Image Replacement</h3> 


By absolutely positioning an empty <span> over the text element, the text is effectively
hidden. If the image fails to load, the text behind it is still displayed. Because the text
remains behind the image, it would show through any transparent areas, so images with
transparency cannot be used with the Shea method.


#header { width: 329px; height: 25px; position: relative; }
#header span { background: url(firHeader.gif) no-repeat; position: absolute; width: 100%; height: 100%; }


Microformat 


In web development, a microformat (sometimes abbreviated μF or uF) is a way of adding
simple semantic meaning to human-readable content which is otherwise, from a machine's point
of view, just plain text. Microformats allow data items such as events, contact details or
locations on HTML (or XHTML) web pages to be meaningfully detected, and the information in
them to be extracted by software and then indexed, searched for, saved or cross-referenced, so
that it can be reused or combined.

More technically, they are items of semantic markup, using just standard (X)HTML with a
set of common class names. They are open and freely available for anyone to use.

Current microformats allow the encoding and extraction of events, contact information,
social relationships, and so on. More are being developed. Version 3 of the Firefox browser[1]
and version 8 of Internet Explorer are expected to include native support for
microformats.


Principles
(X)HTML standards allow for semantics to be embedded and encoded within them. This is
done using specific HTML attributes:
• class
• rel
• rev

For example, 52.48,-1.89 is a pair of numbers which may be understood, from their context, 
to be a set of geographic coordinates. By wrapping them in spans (or other HTML elements) 
with specific class names (in this case part of the geo microformat specification): 


<span class="geo"><span class="latitude">52.48</span>, <span class="longitude">-1.89</span></span>


machines can be told exactly what each value represents, and can then index it, look it up on 
a map, export it to a GPS device, etc. 
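
As a rough illustration of how software might read these values, the sketch below uses Python's standard html.parser module. The GeoParser class is hypothetical and deliberately simplified: it looks only for the latitude and longitude class names and does not check that they sit inside an element with class "geo".

```python
from html.parser import HTMLParser


class GeoParser(HTMLParser):
    """Collect the text of elements classed 'latitude' or 'longitude'."""

    def __init__(self):
        super().__init__()
        self._field = None   # property whose text we are waiting for
        self.values = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in ("latitude", "longitude"):
            if name in classes:
                self._field = name

    def handle_data(self, data):
        if self._field and data.strip():
            self.values[self._field] = float(data.strip())
            self._field = None


markup = ('<span class="geo"><span class="latitude">52.48</span>, '
          '<span class="longitude">-1.89</span></span>')

parser = GeoParser()
parser.feed(markup)
parser.close()
coords = (parser.values["latitude"], parser.values["longitude"])
print(coords)
```

Once the two numbers are available as ordinary values, passing them to a mapping service or a GPS export routine is a separate step that depends on the application.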


Example 
Consider the contact information: 


<div> <div>Joe Doe</div> <div>The Example Company</div> <div>604-555-1234</div> <a 
href="http://example.com/">http://example.com/</a> </div> 


With hCard microformat markup, that becomes: 


<div class="vcard"> <div class="fn">Joe Doe</div> <div class="org">The Example 
Company</div> <div class="tel">604-555-1234</div> <a class="url" 
href="http://example.com/">http://example.com/</a> </div> 


Here the formal name (fn), organisation (org), telephone number (tel) and URL (url) have been
identified using specific class names, and the whole thing is wrapped in class="vcard" (see
vCard), which indicates that the other classes form a vcard and are not just coincidentally named.
Other, optional, hCard classes also exist.

It is now possible for software, for example browser plug-ins, to extract the information, and 
transfer it to other applications, such as an address book. 
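
A minimal sketch of such an extraction step, again using Python's standard html.parser module, is given below. The HCardParser class is hypothetical and handles only the four properties from the example; it also assumes every element has a closing tag, which would not hold for elements such as <img>.

```python
from html.parser import HTMLParser

HCARD_PROPERTIES = ("fn", "org", "tel", "url")


class HCardParser(HTMLParser):
    """Collect hCard properties, but only inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # nesting depth inside the vcard element
        self._field = None   # property whose text we are waiting for
        self.card = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.depth:
            self.depth += 1
            for name in HCARD_PROPERTIES:
                if name in classes:
                    if name == "url" and attrs.get("href"):
                        self.card["url"] = attrs["href"]
                    else:
                        self._field = name
        elif "vcard" in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1
        self._field = None

    def handle_data(self, data):
        if self._field and data.strip():
            self.card[self._field] = data.strip()


markup = (
    '<p class="fn">Not part of any card</p>'
    '<div class="vcard"> <div class="fn">Joe Doe</div> '
    '<div class="org">The Example Company</div> '
    '<div class="tel">604-555-1234</div> '
    '<a class="url" href="http://example.com/">http://example.com/</a> </div>'
)

parser = HCardParser()
parser.feed(markup)
parser.close()
print(parser.card)
```

The stray paragraph with class "fn" outside the vcard wrapper is ignored, which mirrors the point above that the wrapper is what marks the other class names as hCard properties. An address book exporter could then map the resulting dictionary onto its own fields.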


Live example
For a live example of microformats on Wikipedia pages, see HCard#Live example.


Accessibility of microformats
Some microformats use the abbr HTML element, in the "abbr design pattern", to conceal
machine-only data in the expansion of the abbreviation. Screen readers that expand
abbreviations read that machine-only data aloud in place of the human-readable text, which
makes the content hard for their users to follow. This is particularly the case with date-time and
geographical data. See the Web Standards Project's Accessibility Task Force article
"hAccessibility".


Specific microformats 


Several microformats have been developed to enable semantic markup of particular types of 
information. 

• hAtom (hAtom spec) - for marking up Atom feeds from within standard HTML

• hCalendar (hCalendar spec) - for events

• hCard (hCard spec) - for contact information; includes:
    o adr (adr spec) - for postal addresses
    o geo (geo spec) - for geographical coordinates (latitude; longitude)

• hReview (hReview spec) - for reviews

• hResume (hResume spec) - for resumes or CVs

• rel-directory (rel-directory spec) - for distributed directory creation and
inclusion

• rel-nofollow - an attempt to discourage 3rd party content spam (e.g. spam in
blogs)

• rel-tag (rel-tag spec) - for decentralized tagging (folksonomy)

• xFolk (xFolk spec) - for tagged links

• XFN - for social relationships

• XOXO - for lists and outlines


Proposed microformats 


Among the many proposed microformats, the following are undergoing active
development:
• citation - for citing references
• currency - for amounts of money
• Geo extension - for places on Mars, the Moon, and other such bodies
• species - for the names of living things


Uses of microformats 


Using microformats within HTML code provides additional formatting and semantic data 
that can be used by applications. These could be applications that collect data about on-line 
resources, such as web crawlers, or desktop applications such as e-mail clients or scheduling 
software. 

Several browser extensions, such as Operator, provide the ability to detect microformats 
within an HTML document and export them into formats compatible with contact management 
and calendar utilities, such as Microsoft Outlook. 

Microsoft has expressed a desire to incorporate microformats into upcoming projects, as have
other software companies.

In Wikipedia, and more generally in MediaWiki, microformats are used as part of templates
such as {{coord}}.


Creation of microformats 


Most of the existing microformats were created at the microformat wiki and associated
mailing list, by a process of gathering examples of web publishing behaviour, then codifying it.
Some other microformats (such as rel=nofollow and unAPI) have been proposed, or developed,
elsewhere.


Critique of microformats 


Commentators have offered various critiques of microformats, identifying aspects that are
potentially favorable as well as aspects that are potentially unfavorable.


Favorable aspects 


• Microformats are designed for usability and backward-compatibility.

• An alternative approach, using XML (for semantics) plus CSS (for presentation),
is not as flexible with older web browsers.

• Microformats are no more subject to abuse and spam than any other similar
proposal for data normalization.


Unfavorable aspects 


• Microformats present an undesirable mixing of the data layer with the
presentation layer.

• Accessibility is hindered by the use of abbr.

• The use of CSS class names to specify the semantics of markup is sub-optimal,
and would be better expressed as XML elements.

• Microformats present yet another medium for abuse and spam within HTML
markup.


Further reading 


• Allsopp, John (2007). Microformats: Empowering Your Markup for Web 2.0.
Friends of ED. ISBN 978-1-59059-814-6.

• Orchard, Leslie M. (2005). Hacking RSS and Atom. Wiley. ISBN 978-0-7645-9758-9.

• Robbins, Jennifer Niederst (2006). Web Design in a Nutshell, Third Edition. O'Reilly
Media. ISBN 978-0-596-00987-8.

• Suda, Brian (2006). Using Microformats. O'Reilly Media. ISBN 978-0-596-52817-1.

• Haine, Paul (2006). HTML Mastery. Friends of ED, pp. 117-154.
ISBN 978-1-59059-765-1.


Progressive enhancement


Progressive enhancement is a label for a particular strategy of Web design that emphasizes 
accessibility, semantic markup, and external stylesheet and scripting technologies, in a layered 
fashion that allows everyone to access the basic content and functionality of a Web page, using 
any browser or Internet connection, while also enabling those with better bandwidth or more 
advanced browser software to experience an enhanced version of the page. 


History


The term "Progressive Enhancement" was coined by Steven Champeon, of Web design firm
hesketh.com, in a series of articles and presentations for Webmonkey and the SXSW Interactive
conference between March and June 2003.


Introduction and background 


The strategy is an attempt to subvert the traditional Web design strategy known as "graceful
degradation", wherein designers would attempt to create Web pages for the latest browsers that
would also work well in older versions of browser software. Graceful degradation was supposed
to allow the page to "degrade", or remain presentable even if certain technologies assumed by the
design were not present, without being jarring to the user of such older software (hence
"gracefully"). In practice, graceful degradation has been supplanted by an attitude that the end
user should "just upgrade". This attitude is due to time and budget constraints, limited access to
alternate browser software for testing, and the widespread belief that "browsers are free".
Unfortunately, upgrading is often not possible due to IT department policies, older hardware, and
other reasons. The "just upgrade" attitude also ignores deliberate end user choices and the
existence of a variety of browser platforms, many of which run on handhelds or in other contexts
where available bandwidth is paltry, or where screen size and support for sound or color
are far different from those of the typical graphical desktop browser.

In Progressive Enhancement (PE) the strategy is deliberately reversed: a basic markup
document is created, geared towards the lowest common denominator of browser software
functionality, and then the designer adds in functionality or enhancements to the presentation and
behavior of the page, using modern technologies such as Cascading Style Sheets or JavaScript
(or other advanced technologies, such as Flash, Java applets or SVG). All such
enhancements are to be externally linked, in order to avoid forcing browsers of lesser capability
to "eat" data they do not understand and cannot handle, or which would swamp their Internet
connection.

The PE approach is derived from Champeon's early experience (c. 1993-4) with SGML, 
before working with HTML or any Web presentation languages, as well as from later 
experiences working with CSS to work around browser bugs. In those early SGML contexts, 
semantic markup was of key importance, whereas presentation was nearly always considered 
separately, rather than being embedded in the markup itself. This concept is variously referred to 
in markup circles as the rule of separation of presentation and content, separation of content and 
style, or separation of semantics and presentation. As the Web evolved in the mid-nineties,
but before CSS was introduced and widely supported, this cardinal rule of SGML was repeatedly
violated by HTML's extenders. As a result, web designers were forced to adopt new, disruptive 
technologies and tags in order to remain relevant. With a nod to graceful degradation, in 
recognition that not everyone had the latest browser, many began to simply adopt design 
practices and technologies only supported in the most recent and perhaps the single previous 
major browser releases. For several years, much of the Web simply did not work in anything but 
the most recent, most popular browsers. This remained true until the rise and widespread 
adoption of and support for CSS, as well as many populist, grassroots educational efforts (from 
Eric Costello, Owen Briggs, Dave Shea, and others) showing Web designers how to use CSS for 
layout purposes. 

PE is based on a recognition that the core assumption behind "graceful degradation" — that 
browsers always got faster and more powerful — was proving itself false with the rise of 
handheld and PDA devices with low-functionality browsers and serious bandwidth constraints. 
In addition, the rapid evolution of HTML and related technologies in the early days of the Web 
has slowed, and very old browsers have become obsolete, freeing designers to use powerful 
technologies such as CSS to manage all presentation tasks and JavaScript to enhance complex 
client-side behavior. 

First proposed as a somewhat less unwieldy catchall phrase to describe the delicate art of 
"separating document structure and contents from semantics, presentation, and behavior", and 
based on the then-common use of CSS hacks to work around rendering bugs in specific 
browsers, the PE strategy has taken on a life of its own as new designers have embraced the idea 
and extended and revised the approach. 


Core principles 


Progressive Enhancement consists of the following core principles: 
• all basic content should be accessible to all browsers
• all basic functionality should be accessible to all browsers
• sparse, semantic markup contains all content
• enhanced layout is provided by externally linked CSS
• enhanced behavior is provided by unobtrusive, externally linked JavaScript
• end user browser preferences are respected


Support and adoption 


Jim Wilkinson created a wiki page for Progressive Enhancement to collect tips and
tricks and to explain the overall strategy. Designers such as Jeremy Keith have shown how the
approach can be used harmoniously with still other approaches to modern Web design (such as
Ajax) to provide flexible, but powerful, user experiences. Others, including Dave Shea, have
helped to spread the adoption of the term to refer to CSS-based design strategies. Organizations
such as the Web Standards Project have embraced PE as a basis for their educational efforts. In
2006 Nate Koechley at Yahoo! made extensive reference to PE in his own approach to Web
design and browser support, Graded Browser Support (GBS).[5] Steve Chipman at AOL has
referred to PE as a basis for his Web design strategy.[6] David Artz, leader of
the AOL Optimization team, developed a suite of Accessible Rendering Technologies
based on these principles, and invented a technique for disassembling the
"enhancement" on the fly while saving the user's preference. Chris Heilmann discusses the
importance of targeted delivery of CSS so that each browser only gets the content (and
enhancements) it can handle. Many Web design agencies have begun to advertise that they
provide progressive enhancement as a core service.


Benefits for accessibility 


Web pages created according to the principles of PE are by their nature more accessible, 
because the strategy demands that basic content always be available, not obstructed by 
commonly unsupported or easily disabled scripting. Additionally, the sparse markup principle 
makes it easier for tools that read content aloud to find that content. It is unclear as to how well 
PE sites work with older tools designed to deal with table layouts, "tag soup," and the like. 


Benefits for search engine optimization (SEO) 


Improved results with respect to Search Engine Optimization is another side effect of a PE- 
based Web design strategy. Because the basic content is always accessible, and the markup is 
clean and easily parsed for structure and intent, it becomes much easier to tune the content to 
improve SEO results. 


Criticism and responses 


Some skeptics, such as Garret Dimon, have expressed concern that PE is not workable
in situations that rely heavily on JavaScript to achieve certain user interface presentations or
behaviors. Jeremy Keith's XTech06 presentation, Hijax: Progressive Enhancement with Ajax,
suggests that the two are compatible. Others have countered with the point that informational
pages should be coded using PE in order to be indexed by spiders, and that even Flash-heavy
pages should be coded using PE. In a related area, many have expressed their doubts concerning
the principle of the separation of content and presentation in absolute terms, pushing instead for a
realistic recognition that the two are (and some would say should be) inextricably linked.


See also 

• Ajax 

• Liquid layout 


Rollover 


A rollover is a button, created by a web developer or web designer and placed within a 
web page, that provides interactivity between the user and the page. The term originates 
from the visual process of "rolling the mouse cursor over the button", which causes the button 
to react (usually visually, by replacing the button's source image with another 
image) and sometimes results in a change in the web page itself. The "roll" in the term 
probably refers to older mice that had a mechanical assembly consisting of a hard rubber 
ball housed in the base of the mouse (which rolls), as opposed to the modern optical mouse, 
which has no rolling parts. Given today's technology, mouseover is probably the more 
appropriate term. 

Rollovers can be built from images, text or buttons. Only two images or buttons are required 
(optionally with "alt" text for the images) to perform this interactive action. 
A rollover image can be created either with a program that has a built-in tool or by writing 
script code. The author picks a first image and selects an alternate, secondary image, then sets 
the triggering mouse action to either "click on" or "mouse over". Note that 
when the mouse moves over the image, the secondary image appears but does not 
stay: when the user "mouses out" by moving the mouse away from the image, the original 
source image reappears. 


Coding 


There are different ways to create a rollover, and some software tools are designed 
to make the process easier. This is an example of a rollover in CSS 
and HTML: 

CSS 


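a { display: block; width: 100px; height: 30px; background-image: url(default.png); } /* default state */
a:hover { background-image: url(rollover.png); } /* image shown while the mouse is over the link */
a span { display: none; } /* hide the link text so that only the image shows */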


HTML 


<a href="http://en.wikipedia.org/"><span>Wikipedia</span></a> 


Different types of rollovers 


While rollovers are not in themselves animated images, some users and HTML experts have 
managed to create animation-like effects. 

• Zooming rollovers: when the mouse is moved over an image, text or button, it 
increases in size, up to the size limit that has been set. 

• Fading rollovers: when the user moves the mouse over an image, text or button, 
it either fades in or out, depending on how it has been configured. 

• Disjointed rollovers: when the mouse is moved over an image or button, other 
areas on the screen change to reflect what will happen if the user clicks. 


Spacer GIF 


Prior to the adoption of Cascading Style Sheets (CSS), the spacer GIF, a transparent 
image that can be resized according to the width and height dimensions it is given, was often 
used to control blank space within a web page. Spacer GIFs are not browser specific. A 
spacer GIF is transparent so that an HTML developer can create a table cell and fill the 
background with a specific color that shows through the spacer GIF. For 
instance, suppose the developer wants to create a blue box that is 500 x 500 pixels. Instead 
of using a separate blue graphic that is 500 x 500 pixels (taking up additional bandwidth), the 
developer can specify the table cell background color and specify the dimensions of the spacer 
GIF that already exists. 

To create a spacer GIF, open a new file in Adobe Photoshop, Macromedia Fireworks, GIMP 
(or whatever graphics software you have) with a transparent background and with the 
dimensions 1 pixel by 1 pixel. It only needs to be 1 pixel because you will specify the height and 
width in the HTML. Normally you would not stretch a GIF image this way, but since the spacer GIF 
is transparent, stretching it does not matter. Save the image as a transparent GIF named 
"spacer.gif". 

The use of spacer GIFs has declined due to the prevalence of CSS for laying out web pages, 
which achieves the same effect by changing the margin or padding on a given element. CSS is 
preferable as it, if used properly, reduces unnecessary code in a web page. Blank 1x1 GIFs are 
still occasionally used to fix a PNG rendering bug in Microsoft Internet Explorer versions 5.5 - 6. 


History 


David Siegel's 1996 book "Creating Killer Web Sites" was allegedly the first to publish the 
Spacer GIF technique. According to David Siegel himself, he invented the trick in his living 
room, while others were probably inventing the same trick at around the same time. 

According to a blog comment by Joe Kleinberg, he actually invented the technique earlier. 
However, it has not been possible to confirm further, with screenshots, code or similar. 


Techniques for creating a User Centered Design 


Techniques for creating a web based User-Centered Design 

User-centered design (UCD) is a philosophy and a process. It is a philosophy that places 
the person (as opposed to the 'thing') at the center; it is a process that focuses on cognitive factors 
(such as perception, memory, learning, problem-solving, etc.) as they come into play during 
people's interactions with things. UCD seeks to answer questions about users and their tasks and 
goals, and then use the findings to drive development and design (Katz-Haas). An effective UCD 
is essential for any high-functioning web page. The importance of a UCD lies in its ability to 
communicate with the user in a way that allows a thorough and simple presentation of the site's 
content and navigation conventions. This article covers six principal conventions of a UCD 
that work in conjunction with each other to create an effective UCD. 

Navigation Aids 

The first principle that all Web designers should understand when creating an effective 
web-based UCD is that navigation aids must be clear and consistent. All users are expected to 
navigate through cyberspace via hyperlinks. Lynch and Horton, authors of The Web Style Guide, 
believe that the most problematic design issue when it comes to navigation aids is users who lack 
a sense of where they are within the hierarchy of a site. To counteract lost visitors, the style 
guide suggests that the designer use clear, consistent icons, graphics, and identity schemes 
(Lynch and Horton 2002). Professor James R. Grunwald suggests that if a web site is going to be 
useful, it must be easy to navigate. Navigation occurs at two levels: within a particular web site 
and between web sites. Menus are often used to aid navigation within a web site (Grunwald 
1997). The site design should always allow users to easily return to the home page and other 
upper-level pages. The links that accomplish this should be placed in a consistent position on 
every page. 

Simplicity and Consistency 

The second principle is in direct relation to the first, and it deals with simple, consistent, 
conventional design. Users do not appreciate an over-designed site. The average website should 
follow all conventions of layout and navigation because users are already familiar with them. A 
website should be consistent and predictable (Lynch and Horton 2002). For maximum 
functionality and legibility, your page and site design should be built on a consistent pattern of 
modular units that all share the same basic layout grids, graphic themes, editorial conventions, 
and hierarchies of organization (Lynch and Horton 2002). To be more specific, Grunwald 
suggests that designers keep page lengths short, use appropriate text fonts and styles, and use 
color appropriately (Grunwald 1997). These are all important concepts to keep in mind when 
building a UCD because certain colors and fonts have specific rhetorical effects on an audience 
and a long page length is an immediate deterrent. Conventional design lends the user some 
automatic familiarity which will directly affect whether or not the user has a good feeling about 
the website. 

Dead Links 

The third principle to keep in mind deals with dead links. Designers should always be aware 
of all links to, from, and within the site. Most web pages do not begin with a preface explaining 
where, within the hierarchy, a page lies (Lynch and Horton 2002). It is therefore important that 
users who are sent directly to a subsection of a website are not restricted to that page and 
others lower in the hierarchical scheme. Designers should place working links on every page that 
allow the users to go to the homepage and other main pages within the site. Grunwald points out 
that one way to enhance the navigability within a site is to include navigation aids 
such as "return to home page", "previous page", and "next page" links on each web page. This 
not only increases the navigability within your site, but allows users who enter your site on a 
page other than your home page to easily find their way around. While you don't have control 
over other sites, you do have control over the links from your site to other sites (Grunwald 
1997). 

Direct Access to Information 

The fourth principle that is important to an intelligent UCD is letting users obtain the 
sought-after information in the fewest clicks. Direct access requires the designer to create a very 
efficient hierarchy of content in order to decrease the number of steps it takes a user to find what 
they are looking for. Studies have shown that users prefer menus that present 5 to 7 links and that 
they prefer a few very dense screens of choices to many layers of simplified menus. The design 
hierarchy should allow for real content to be only a click or two away from the site's main pages 
(Lynch and Horton 2002). A UCD needs to communicate clearly with the intended user. In order 
to communicate clearly, a user interface should be well organized and structured (Grunwald 
1997). Should users have trouble finding the information they are looking for in a reasonable 
amount of time, the site they are visiting will have made an unfavorable first 
impression. 

Visual Confirmation 

The fifth principle involved in creating an effective web-based UCD is visual 
confirmation of the user's location within a site (Lynch and Horton 2002). Whether it is through 
links, titles and headings or a breadcrumb trail, the user should always have visual feedback from 
the site as to their position within it. This gives the user a frame from which to work. 
Knowing a relative or concrete position within a site lets the user feel comfortable and open to 
the message of the site; it also lends ethos to the site. The Web site designer should also allow for 
dialog with the users. When the design process begins, the web site designer should have a good 
idea of the intended audience, but through dialog with the users, the designer can make helpful 
changes to the UCD based on the information gathered from and about the site's users. Providing 
an email address and other contact information lets the users establish contact with the 
webmaster (Lynch and Horton 2002). 

Connection Speed 

The sixth and final principle needed in order to create an effective web-based UCD is a 
technical issue. Not everyone accesses the web via a high-speed connection. Designers should be 
aware of this and construct a site that can be handled by a slower internet connection. All users 
get frustrated when they have to wait for a page to load. Web page designs that are not well 
suited to the connection speed of the average user will probably lose traffic. Research has shown that 
for most computing tasks the threshold of frustration is about ten seconds (Lynch and Horton 
2002). Designers should be conservative with the number of large graphics because not all users 
have a high-speed connection, and even those who do still value a fast-loading page. 

All these principles working in conjunction with one another should allow for an effective 
web-based UCD. These principles create design integrity and functional stability (Lynch and Horton 
2002). An effective UCD is the first step in securing high traffic and a profitable return on 
investment. There are many more UCD techniques not discussed here, but these six 
should get any designer off on the right track. 


URL redirection 


URL redirection, also called URL forwarding, domain redirection and domain 
forwarding, is a technique on the World Wide Web for making a web page available under 
many URLs. 


Purposes 


There are several reasons for a webmaster to use redirection: 


Similar domain names 


Users might search for the same information under slightly different URLs, e.g. gooogle.com 
and googel.com. An organization can register these domains and redirect them to the correct 
location: google.com. Alternatively, another organization can register these domains and redirect 
them to its own website, thus catching the traffic of careless typists. 


Moving a site to a new domain 


A Web site might change its domain name for several reasons. An author might move his or 
her pages to a new domain or two sites might merge. With URL redirects, incoming links to the 
old URL can be directed to the new location. These links might be from other sites that have not 
realized that there is a change or from bookmarks/favorites that users have saved in their 
browsers. 

The same applies to search engines. They have the older domain in their database and will 
link visitors to the URLs found previously. By using a "moved permanently" redirect to the new 
URL, visitors will still end at the correct page. Also, in the next crawl, the search engine should 
detect and use the newer URL. 


Load balancing 


Redirects issued by the server or by a redirect page are sometimes used to distribute requests 
and so spread bandwidth usage; the redirect targets are usually rotated between the main site and 
its mirrors. 
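The rotation described above can be sketched in a few lines. This is only an illustration of round-robin selection, not a description of how any particular server does it; the mirror host names and the function names are made up for the example.

```python
from itertools import cycle

# Hypothetical mirror list; the host names are placeholders, not real mirrors.
MIRRORS = [
    "http://www.example.com",
    "http://mirror1.example.com",
    "http://mirror2.example.com",
]


def make_redirector(mirrors):
    """Return a function that maps a request path to the next mirror in rotation."""
    rotation = cycle(mirrors)

    def redirect_target(path):
        # Each call advances the rotation by one host.
        return next(rotation) + path

    return redirect_target
```

Each call to the returned function yields the URL that would be placed in the Location: header of the next redirect response.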


Logging outgoing links 


The access logs of most web servers keep detailed information about where visitors came from and 
how they browsed the hosted site. They do not, however, log which links visitors left by. This is 
because the visitor's browser has no need to communicate with the original server when the 
visitor clicks on an outgoing link. 

This information can be captured in several ways. One way involves URL redirection. 
Instead of sending the visitor straight to the other site, links on the site can point to a URL on the 
original website's domain that automatically redirects to the real target. This added request 
leaves a trace in the server logs saying exactly which link was followed. This technique is also 
used by some corporate websites to show a "warning" page stating that the content is off-site and not 
necessarily affiliated with the corporation. The downside of this technique is the delay caused by 
the additional request to the original website's server. For websites that display a 
"warning" page before automatically forwarding, the length of time the warning is displayed is 
an additional delay. 
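A minimal sketch of such a logging redirector follows. The "/out" path, the "url" parameter name, and the use of a plain list as the log are assumptions made for illustration; the scheme check is an added precaution that the text above does not mention.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def tracking_link(target, redirector="/out"):
    """Build a same-site link that carries the real target as a query parameter."""
    return redirector + "?" + urlencode({"url": target})


def handle_outgoing(request_path, log):
    """Record the followed link and return (status, headers) for the redirect."""
    query = parse_qs(urlparse(request_path).query)
    target = query.get("url", [""])[0]
    # Only forward to absolute http(s) URLs; anything else is rejected.
    if urlparse(target).scheme not in ("http", "https"):
        return 400, {}
    log.append(target)
    return 302, {"Location": target}
```

Pages on the site would link to tracking_link(...) instead of the external URL, and the handler behind "/out" would record the target before redirecting.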


Short, meaningful, persistent aliases for long or changing URLs 


Currently, web engineers tend to pass descriptive attributes in the URL to represent data 
hierarchies, command structures, transaction paths and session information. This results in a 
URL that is aesthetically unpleasant and difficult to remember. Sometimes the URL of a page 
changes even though the content stays the same. A redirect from a short, stable alias to the 
current URL addresses both problems. 


Manipulating search engines 


Some years ago, redirect techniques were used to fool search engines. For example, one page 
could show popular search terms to search engines but redirect the visitors to a different target 
page. There are also cases where redirects have been used to "steal" the page rank of one popular 
page and use it for a different page, usually involving the 302 HTTP status code of "moved 
temporarily." 

Search engine providers noticed the problem and took appropriate actions. Usually, sites that 
employ such techniques to manipulate search engines are punished automatically by reducing 
their ranking or by excluding them from the search index. 

As a result, today, such manipulations usually result in less rather than more site exposure. 


Satire and criticism 

In the same way that a Google bomb can be used for satire and political criticism, a domain 
name that conveys one meaning can be redirected to any other web page, sometimes with 
malicious intent. 


Manipulating visitors 


URL redirection is sometimes used as a part of phishing attacks that confuse visitors about 
which web site they are visiting. 


Techniques 


There are several techniques to implement a redirect. In many cases, the Refresh meta tag is the 
simplest one. However, its use is strongly discouraged by some, including the W3C. 


Manual redirect 


The simplest technique is to ask the visitor to follow a link to the new page: 


Please follow <a href="http://www.example.com/">link</a>! 


This method is often used as a fallback for one of the following methods: if the visitor's 
browser does not support the automatic redirect method, the visitor can still reach the target 
document by clicking on the link. 


HTTP status codes 3xx 


In the HTTP computer protocol used by the World Wide Web, a redirect is a response with a 
status code beginning with 3 that induces a browser to go to another location. 
The HTTP standard defines several status codes for redirection: 

• 300 multiple choices (e.g. offer different languages) 
• 301 moved permanently 
• 302 found (e.g. temporary redirect) 
• 303 see other (e.g. for results of CGI scripts) 
• 307 temporary redirect 

All of these status codes require that the URL of the redirect target be given in the Location: 
header of the HTTP response. A 300 multiple choices response will usually list all choices in the body 
of the message and show the default choice in the Location: header. 

Within the 3xx range, there are also some status codes that are quite different from the above 
redirects (they are not discussed in detail here): 

• 304 not modified 
• 305 use proxy 
• 306 not used 

This is a sample of an HTTP response that uses the 301 "moved permanently" redirect: 


HTTP/1.1 301 moved permanently
Location: http://www.example.org/
Content-type: text/html
Content-length: 57

Please follow <a href="http://www.example.org/">link</a>!
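For readers who want to see such a response produced by a program, here is a minimal sketch written as a WSGI application in Python. The choice of WSGI, the name redirect_app, and the constant NEW_LOCATION are assumptions for illustration; the source does not prescribe any particular server technology.

```python
# Assumed redirect target, for illustration only.
NEW_LOCATION = "http://www.example.org/"


def redirect_app(environ, start_response):
    """WSGI application that answers every request with a 301 redirect."""
    # Fallback link for clients that do not follow the redirect automatically.
    body = ('Please follow <a href="%s">link</a>!' % NEW_LOCATION).encode("ascii")
    headers = [
        ("Location", NEW_LOCATION),
        ("Content-Type", "text/html"),
        ("Content-Length", str(len(body))),
    ]
    start_response("301 Moved Permanently", headers)
    return [body]
```

The application can be served locally with the standard library's wsgiref.simple_server.make_server, or by any other WSGI server.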


Using server side scripting for Redirection 


Often, web authors don't have sufficient permissions to produce these status codes: the 
HTTP header is generated by the web server program and not read from the file for that URL. 
Even for CGI scripts, the web server usually generates the status code automatically and allows 
custom headers to be added by the script. To produce HTTP status codes with CGI scripts, one 
needs to enable non-parsed headers. 

Sometimes, it is sufficient to print the "Location: url" header line from a normal CGI script. 
Many web servers choose one of the 3xx status codes for such replies. 

Because the redirect is carried in the HTTP response headers, it must be sent before any web 
page data. As a result, the web programmer who is using a scripting language to redirect the 
user's browser to another page must ensure that the redirect is the first or only part of the 
response. In the ASP scripting language, this can be accomplished using response.buffer=true and 
response.redirect "http://www.example.com". Using PHP, one can use header("Location: 
http://www.example.com");. 

According to the original HTTP/1.1 standard (RFC 2616), the Location header must contain an 
absolute URI. When redirecting from one page to another within the same site, it is a common 
mistake to use a relative URI. As a result most browsers tolerate relative URIs in the Location 
header, but some browsers display a warning to the end user. (The later revision of the standard, 
RFC 7231, permits relative references.) 
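One way to avoid sending a relative URI is to resolve it against the URL that was requested before writing the header. The sketch below uses Python's standard urllib.parse module; the helper names are invented for this example.

```python
from urllib.parse import urljoin, urlparse


def absolute_location(request_url, location):
    """Resolve a possibly relative Location value against the requested URL."""
    return urljoin(request_url, location)


def is_absolute(uri):
    """True if the URI carries both a scheme and a host."""
    parts = urlparse(uri)
    return bool(parts.scheme and parts.netloc)
```

A script would call absolute_location with the URL of the current request and the intended target, and emit the result in the Location header.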


Using .htaccess for Redirection 


When using the Apache web server, a directory-specific .htaccess file can be used. 

To move a single page: 


Redirect 301 /oldpage.html http://www.example.com/newpage.html 


To change domain names: 


RewriteEngine On
RewriteCond %{HTTP_HOST} ^.*oldwebsite\.com$ [NC]
RewriteRule ^(.*)$ http://www.preferredwebsite.net/$1 [R=301,L]


This method usually does not require administrator permissions. 


Refresh Meta tag and HTTP refresh header 


Netscape introduced a feature to refresh the displayed page after a certain amount of time. 
This method is often called meta refresh. It is possible to specify the URL of the new page, thus 
replacing one page after some time by another page: 

• HTML <meta> tag 
• An exploration of dynamic documents 
• Proprietary extensions 

A timeout of 0 seconds means an immediate redirect. 

This is an example of a simple HTML document that uses this technique: 


<html>
<head>
<meta http-equiv="Refresh" content="0; url=http://www.example.com/">
</head>
<body>
Please follow <a href="http://www.example.com/">link</a>!
</body>
</html>


• This technique is usable by all web authors because the meta tag is contained 
inside the document itself. 

• The meta tag must be placed in the "head" section of the HTML file. 

• Note the unusual syntax of the content attribute: the delay and the URL share a 
single value, separated by a semicolon. 

• The number "0" in this example may be replaced by another number to achieve 
a delay of that many seconds. 

• Many users regard a delay of this kind as annoying unless there is a reason for 
it. 

• This is a proprietary/non-standard extension by Netscape. It is supported by 
most web browsers. 
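To make the syntax of the content value concrete, the following sketch splits it into its delay and URL parts. It is a simplified illustration, and the function name is made up; real browsers accept more variations than this handles.

```python
def parse_refresh(content):
    """Split a Refresh value such as "0; url=http://www.example.com/" into (delay, url).

    Returns (delay, None) when no URL is given, which means "reload this page".
    """
    delay_part, _, rest = content.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    # The "url=" prefix is matched case-insensitively.
    if rest[:4].lower() == "url=":
        return delay, rest[4:].strip().strip("'\"")
    return delay, None
```

The same value format is used by the HTTP Refresh header, so the function applies to both forms.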

This is an example of achieving the same effect by issuing a HTTP refresh header: 


HTTP/1.1 200 ok
Refresh: 0; url=http://www.example.com/
Content-type: text/html
Content-length: 57

Please follow <a href="http://www.example.com/">link</a>!


This response is easier to generate by CGI programs because one does not need to change the 
default status code. Here is a simple CGI program that effects this redirect: 


#!/usr/bin/perl
# Emit the Refresh header, then a blank line, then a fallback link.
print "Refresh: 0; url=http://www.example.com/\r\n";
print "Content-type: text/html\r\n";
print "\r\n";
print "Please follow <a href=\"http://www.example.com/\">link</a>!";


Note: Usually, the HTTP server adds the status line and the Content-length header 
automatically. 

The W3C considers this a poor method of redirection, since it does not 
communicate any information about either the original or the new resource to the browser (or 
search engine). The W3C's Web Content Accessibility Guidelines (7.4) discourage the creation of 
auto-refreshing pages, since most web browsers do not allow the user to disable or control the 
refresh rate. W3C articles on the issue include W3C Web Content Accessibility 
Guidelines (1.0): Ensure user control of time-sensitive content changes and Use standard 
redirects: don't break the back button! 


JavaScript redirects 


JavaScript offers several ways to display a different page in the current browser window. 
Quite frequently, they are used for a redirect. However, there are several reasons to prefer an 
HTTP header or the refresh meta tag (whenever possible) over JavaScript redirects: 

• JavaScript may be disabled or unavailable: 

    o Some users disable it out of security considerations. 
    o Some browsers don't support JavaScript. 
    o Many crawlers (e.g. from search engines) don't execute JavaScript. 

• There is no "standard" way of doing it: a search for "you are being redirected" 
will find that virtually each JavaScript redirect employs different methods. This 
makes it difficult for Web client programmers to honor your redirect request 
without implementing all of JavaScript. 


Frame redirects 


A slightly different effect can be achieved by creating a single HTML frame that contains the 
target page: 


<frameset rows="100%">
<frame src="http://www.example.com/">
</frameset>
<noframes>
Please follow <a href="http://www.example.com/">link</a>!
</noframes>


One main difference to the above redirect methods is that for a frame redirect, the browser 
displays the URL of the frame document and not the URL of the target page in the URL bar. 

This technique is commonly called cloaking. It may be used so that the reader sees a more 
memorable URL or, with fraudulent intentions, to conceal a phishing site as part of website 
spoofing. 


Redirect loops 


It is quite possible that one redirect leads to another redirect. For example, the URL 
http://www.wikipedia.com/wiki/URL_redirection (note the differences in the domain name) is 
first redirected to http://www.wikipedia.org/wiki/URL_redirection and again redirected to the 
correct URL: http://en.wikipedia.org/wiki/URL_redirection. This is appropriate: the first 
redirection corrects the wrong domain name. The second redirection selects the correct language 
section. Finally, the browser displays the correct page. 

Sometimes, however, a mistake can cause the redirection to point back to the first page, 
leading to an infinite loop of redirects. Browsers usually break that loop after a few steps and 
display an error message instead. 

The HTTP standard states: 

A client SHOULD detect infinite redirection loops, since such loops generate network 
traffic for each redirection. 

Previous versions of this specification recommended a maximum of five redirections; some 
clients may exist that implement such a fixed limitation. 
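The loop detection a client might perform can be sketched as follows. To keep the example self-contained, a plain dictionary stands in for real HTTP responses; the function name and the default limit of five hops (borrowed from the older recommendation quoted above) are choices made for this illustration.

```python
def follow_redirects(start, redirects, max_hops=5):
    """Follow redirects from start and return the final URL.

    redirects is a dict (URL -> redirect target) standing in for real
    HTTP responses. Raises ValueError on a loop or when max_hops is exceeded.
    """
    seen = {start}
    url = start
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen:
            raise ValueError("redirect loop detected at " + url)
        if hops > max_hops:
            raise ValueError("too many redirects")
        seen.add(url)
    return url
```

A chain such as the wikipedia.com example above resolves normally, while a pair of pages that redirect to each other raises an error on the second hop.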


Services 


There exist services that can perform URL redirection on demand, with no need for technical 
work or access to the webserver your site is hosted on. 


URL redirection services 


URL redirection services exist to shorten long URLs. 

Some web publishers have criticized the use of these services, arguing that replacing a URL 
with an encoded shortcut effectively erases information from a document. For instance, a 
redirected URL may link to a blacklisted site. 

Hyperlinks involving URL redirection services are frequently used in spam messages 
directed at blogs and wikis. Thus, one way to reduce spam is to reject all edits and comments 
containing hyperlinks to known URL redirection services; however, this will also remove 
legitimate edits and comments and may not be an effective method to reduce spam. 


URL obfuscation services 


There exist redirection services for hiding the referrer using META refresh. 
This is very easy to do with PHP, as in this example: 


<?php
/* This code is placed into the public domain */
/* Will redirect a URL; the value is HTML-escaped before it is echoed */
$u = htmlspecialchars($_GET['url'], ENT_QUOTES);
?>
<html>
<head><title>Redirect</title>
<meta http-equiv="refresh" content="0; URL=<?php echo($u); ?>">
</head>
<body>
You should be redirected to <a href="<?php echo($u); ?>"><?php echo($u); ?></a>.
</body></html>
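Because such a page echoes a visitor-supplied URL into HTML, it is worth escaping the value and restricting it to web addresses. The following Python sketch shows one way this might be done; the function name and the http/https restriction are assumptions of this example, not part of the PHP code above.

```python
from html import escape
from urllib.parse import urlparse


def refresh_page(target):
    """Return an HTML page that meta-refreshes to target, or None if target is not http(s)."""
    if urlparse(target).scheme not in ("http", "https"):
        return None
    # Escape &, <, > and quotes so the URL cannot break out of the attribute.
    safe = escape(target, quote=True)
    return (
        "<html><head><title>Redirect</title>"
        '<meta http-equiv="refresh" content="0; URL=%s">'
        "</head><body>"
        'You should be redirected to <a href="%s">%s</a>.'
        "</body></html>" % (safe, safe, safe)
    )
```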


See also 


• Link rot 


Web Interoperability 


Web Interoperability means producing web pages that, by following the latest web standards, 
are viewable in standards-compliant web browsers, on various operating systems such as Windows, 
Macintosh and Linux, and on devices such as PCs, PDAs and mobile phones. 


History 


The term originated with the Web Interoperability Pledge (WIP), a promise to adhere to 
current HTML Recommendations as promulgated by the World Wide Web Consortium (W3C). 
The WIP was not a W3C initiative; it was started and run quite independently by ZDNet 
AnchorDesk. 

The issue was known as cross-browsing during the browser war between Internet Explorer and 
Netscape. Microsoft Internet Explorer subsequently dominated the browser market, but 
modern web browsers such as Mozilla Firefox, Opera and Safari support web standards. 
For backward compatibility with Internet Explorer, many web pages still use non-standard 
HTML tags and DOM handling scripts, as well as platform-dependent techniques such 
as ActiveX. These are very harmful for Web accessibility and device independence. 


Elements of Web Interoperability 


• Structural and semantic markup with XHTML. 

• CSS-based layout with layout properties such as position and float. 

• Separation of structure, presentation and behavior in web pages. 

• DOM scripting based on the W3C DOM standard and ECMAScript. 


Activities 

There have been various activities, for example the Web Standards Project, Mozilla's Technology 
Evangelism and the Web Standards Group. There are also educational activities such as the Web 
Essential Conference. 


See also 


• Web accessibility 


Web modeling 


Web modeling (aka model-driven Web development) is a branch of Web engineering 
which addresses the specific issues related to design and development of large-scale Web 
applications. In particular, it focuses on the design notations and visual languages that can be 
used for the realization of robust, well-structured, usable and maintainable Web applications. 
Designing a data-intensive Web site amounts to specifying its characteristics in terms of various 
orthogonal abstractions. The main orthogonal models that are involved in complex Web 
application design are: data structure, content composition, navigation paths, and presentation 
model. 

Several languages and notations have been devised for Web application modeling. Among 
them, we can cite: 

• HDM - W2000 
• RMM 
• OOHDM 
• ARANEUS 
• STRUDEL 
• TIRAMISU 
• WebML 
• Hera 
• UML Web Application Extension 
• UML-based Web Engineering (UWE) 
• ACE 
• WebArchitect 
• OO-H 


Web template 


A web template is an element of a web template system that is used to rapidly generate and 
mass-produce web pages through a pre-defined schematic, layout, or finite number of 
programming language instructions.[1] In its simplest sense, a web template operates similarly to 
a form letter.[2] One of the primary design principles used to justify the use of web templates is 
the "separation of presentation and content".[3] 

Content (from a database), and "presentation specifications" (in a web template), are 
combined (through the template engine) to mass-produce web documents. 
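As a small illustration of this combination, the sketch below uses the Template class from Python's standard string module as a stand-in for a template engine. The page layout and the sample records are invented for the example; a real system would read the content from a database.

```python
from string import Template

# Presentation: a hypothetical page template with named placeholders.
PAGE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)

# Content: records as they might come from a database (made-up sample data).
ROWS = [
    {"title": "Rollover", "body": "A button that reacts to the mouse."},
    {"title": "Spacer GIF", "body": "A transparent image used for layout."},
]


def render_all(template, rows):
    """Combine one template with many content records, giving one page per record."""
    return [template.substitute(row) for row in rows]


pages = render_all(PAGE, ROWS)
```

Changing PAGE alters the appearance of every generated page without touching the content records, which is the separation the text describes.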


Basic concepts 


A web template has certain basic defining characteristics, which can be 
described according to core principles. Among these core principles are: 

• the principle of effective separation; 
• the principle of flexible presentation; and 
• the principle of reusability. 


Effective separation 


A common goal among experienced web developers is to develop and deploy applications 
that are flexible and easily maintainable. An important consideration in reaching this goal is the 
separation of business logic from presentation logic.[4] Developers use web template systems 
(with varying degrees of success) to maintain this separation. 

One difficulty in evaluating this separation is the lack of well-defined formalisms to measure 
when and how well it is actually met. There are, however, fairly standard heuristics that have 
been borrowed from the domain of software engineering. These include "inheritance" (based on 
principles of object-oriented programming) and "templating and generative programming" 
(consistent with the principles of MVC separation). The precise difference between the various 
guidelines is subject to some debate, and some aspects of the different guidelines share a degree 
of similarity. 


Flexible presentation 


One major rationale behind "effective separation" is the need for maximum flexibility in the
code and resources dedicated to the presentation logic. Client demands, changing customer
preferences, and the desire to present a "fresh face" for pre-existing content often result in the
need to dramatically modify the public appearance of web content while disrupting the underlying
infrastructure as little as possible.

The distinction between "presentation" (front end) and "business logic" (infrastructure) is 
usually an important one, because: 

• the presentation source code language may differ from other code assets;
• the production process for the application may require the work to be done at
separate times and locations;
• different workers have different skill sets, and presentation skills do not always
coincide with skills for coding business logic; and
• code assets are easier to maintain and more readable when disparate
components are kept separate and loosely coupled.[5]


Reusability 


Not all potential users of web templates are willing or able to hire developers to
design a system for their needs. Additionally, some may wish to use the web but have limited or
no technical proficiency. For these reasons, a number of developers and vendors have released
web templates specifically for reuse by non-technical people. Although web template reusability
is also important for highly skilled and technically experienced developers, it is especially
critical to those who rely on simplicity and "ready-made" web solutions.

Such "ready-made" web templates are sometimes free, and an individual can easily make one
at home. Specialized web templates, however, are sometimes sold online. Numerous commercial
sites offer web templates for a licensing fee, and there are also free and "open-source" sources.


Open source templates


The open source design movement has brought slow but steady growth in the
community of open source designers. Some sites offer open source templates in addition to other
content.

Examples include:

• www.oswd.org
• www.openwebdesign.org
• www.dotemplate.com

As of this writing there are over 4000 unique templates available for modification and use by
anyone under various open source licenses.


System elements 


The following describes some of the common technical aspects associated with web 
templates. 


Sub-template 


A web template may include other templates as "sub-templates". Page headers and page 
footers, for use in a primary template, are typical uses for sub-templates. A sub-template is 
similar to a subroutine in imperative programming languages. The sub-template supplies a 
document fragment to the primary template, which is then sent to the final output destination. 

The specific functionality of sub-templates depends on the template language in use. For 
example, template code enclosed in sub-templates may not follow the same rules for variable 
interpolation and processing of template directives as code specified directly in a primary 
template. 
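
As a rough sketch of the sub-template idea, the example below again uses Python's standard `string.Template` as a stand-in for a template language. The templates `HEADER`, `FOOTER` and `PRIMARY` and the function `render_page` are hypothetical names chosen for illustration; real template languages usually provide their own include directive instead.

```python
from string import Template

# Hypothetical sub-templates: document fragments reused by every page.
HEADER = Template("<header>$site_name</header>")
FOOTER = Template("<footer>Copyright $year $site_name</footer>")

# Primary template: $header and $footer are filled with rendered fragments.
PRIMARY = Template("$header<main>$content</main>$footer")


def render_page(content, site_name, year):
    """Render the sub-templates first, then pass the fragments to the primary."""
    shared = {"site_name": site_name, "year": year}
    return PRIMARY.substitute(
        header=HEADER.substitute(shared),
        footer=FOOTER.substitute(shared),
        content=content,
    )


page = render_page("Hello", "Example Site", 2007)
```

Each sub-template behaves like a subroutine: it receives values, returns a document fragment, and the primary template decides where that fragment goes.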


Notes and references 


1. Suh, Woojong (2005). Web Engineering: Principles and Techniques. Idea Group
Inc (IGI). ISBN 1591404339.

2. Chamberlain, Darren (2004). Perl Template Toolkit. O'Reilly. ISBN 0596004761. "The
basic task of a template processor is to output some amount of changeable data
surrounded by unchanging data."

3. (Suh, p. 251)

4. Parr, Terence John (2004). Enforcing strict model-view separation in template
engines. Proceedings of the 13th international conference on World Wide Web.
ISBN 1-58113-844-X.


See also 


• CSS to complete the "presentation separation".
• Content management system


Web-safe fonts


Web-safe fonts are fonts that are likely to be present on all or most computer systems and
hence can be relied on to display consistently on web pages. Typically, when a particular
font is not installed, the browser reverts to some system default font or, in the case of alternate
character encodings, displays "garbage" or meaningless characters.


CSS Generic Font Families 


The safest option is to use the generic font families defined by CSS. These are:
sans-serif, serif, monospace, cursive and fantasy. One of them should always be specified as a
last resort for every font-family CSS property.
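
The rule can be made mechanical. The helper below is a hypothetical sketch (the name `font_family` and its default fallback are assumptions, not part of CSS or of any library): it builds a font-family declaration from a list of preferred fonts and always ends it with one of the five generic families listed above.

```python
GENERIC_FAMILIES = ("sans-serif", "serif", "monospace", "cursive", "fantasy")


def font_family(fonts, fallback="sans-serif"):
    """Build a CSS font-family declaration that ends in a generic family."""
    if fallback not in GENERIC_FAMILIES:
        raise ValueError(f"not a CSS generic family: {fallback}")
    # Drop generic names from the preferred list; the fallback goes last.
    names = [f for f in fonts if f not in GENERIC_FAMILIES]
    # Quote names containing spaces, as is conventional in CSS.
    quoted = [f'"{n}"' if " " in n else n for n in names]
    return "font-family: " + ", ".join(quoted + [fallback]) + ";"


declaration = font_family(["Verdana", "Trebuchet MS"])
```

With the inputs shown, `declaration` is `font-family: Verdana, "Trebuchet MS", sans-serif;`, so a browser without either named font still falls back to a sans-serif face.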


Microsoft's Core Fonts for the Web 


Microsoft started the Core fonts for the Web initiative. Despite the name, however, only
some of the fonts defined there are actually web-safe. These are: Arial, Courier New, Times New
Roman, Verdana, Trebuchet MS (more recently) and Lucida Sans (to some extent).


Web-Safe Unicode Fonts 


Only two fonts available by default on the Windows platform, Arial and Lucida Sans, 
provide a wide Unicode character repertoire. A bug in Verdana (and the different handling of it 
by various user agents) hinders its usability where combining characters are desired. 


Dingbats 


No dingbats font, including Webdings, can be considered web-safe. 


Website architecture 


Website architecture is an approach to the design and planning of websites which, like
architecture itself, involves technical, aesthetic and functional criteria. As in traditional
architecture, the focus is properly on the user and on user requirements. This requires particular
attention to web content, a business plan, usability, interaction design, information architecture
and web design. For effective search engine optimisation it is necessary to have an appreciation
of how a single website relates to the World Wide Web.

Since web content planning, design and management come within the scope of design 
methods, the traditional Vitruvian aims of Commodity, Firmness and Delight can guide the 
architecture of websites, as they do physical architecture and other design disciplines. Website 
architecture is coming within the scope of Aesthetics and Critical Theory and this trend may 
accelerate with the advent of the Semantic Web and Web 2.0. Both ideas emphasise the 
structural aspects of information. Structuralism is an approach to knowledge which has 
influenced a number of academic disciplines including aesthetics, critical theory and 
postmodernism. Web 2.0, because it involves user-generated content, directs the website
architect's attention to the structural aspects of information.

"Website architecture" has the potential to be a term used for the intellectual discipline of 
organizing website content. "Web design", by way of contrast, describes the practical tasks, part- 
graphic and part-technical, of designing and publishing a website. The distinction compares to 
that between the task of editing a newspaper or magazine and its graphic design and printing. But 
the link between editorial and production activities is much closer for web publications than for 
print publications. 


See also 


• Website
• Web design


Website wireframe 


In web design, wireframes are a basic visual guide used to suggest the layout and placement 
of fundamental design elements in the interface design. Because of this they must be completed 
before any artwork is developed. When completed correctly they will provide a visual reference 
upon which to structure each page. They also allow for the development of variations of a layout 
to maintain design consistency throughout your site. This is an important part of the initial 
development stage because it creates user expectations and helps to develop an awareness and 
familiarity throughout the site. 

Creating a set of wireframes for any project also acts as a communication tool for clients and
stakeholders, such as content creators, engineers and developers. Over the course of a project the
wireframing exercise functions as a stable base on which to consider changes, diverse user paths 
and new requirements. The information architect and information designer typically use the 
wireframes as a meeting of the minds, in terms of having solid working documents on which to 
establish the language, content and structure of interactions users will have with a given site or 
project. 

The creation of wireframes also helps to place the global and secondary levels
of navigation in a prominent and intuitive position, as well as providing an area for utilities such
as helpful information and search facilities. When creating your wireframes it is critical to ensure
that branding is central to the identity of a site so as to communicate the personality of the site.


Website Wireframing Tools 


• Microsoft Visio (Windows)
• Intuitect Professional (Visio add-on, Windows)
• Axure RP Pro (Windows)
• Omnigraffle (Mac)
• Adobe Photoshop (Mac & Windows)
• Adobe Fireworks (Mac & Windows)


Web development software


Most web site builders are proprietary tools provided by web hosting companies, which cater
to people who wish to build their own websites without learning the technical aspects of web
page production. The person who wishes to use the website builder typically signs up with the
company (most offer free trial periods) and chooses the design that best suits his or her
purpose. Some companies' tools allow the user to see the source code, that is, the HTML behind
the page being built. Many do not, meaning that only certain designated areas on the page can
be modified. Usually, these areas are headers, text and some graphic elements.

Site builders in use as of 2007 include Macromedia Flash Professional, Adobe
Dreamweaver, Microsoft FrontPage, Yahoo! SiteBuilder and Macromedia ColdFusion.
Macromedia Flash Professional is the industry's most advanced site building environment for
creating interactive websites with digital and mobile content. Adobe Dreamweaver is the
industry standard for advanced web publishing. Microsoft FrontPage delivers professional
quality web sites but has complications with browsers other than Microsoft Internet Explorer.
Yahoo! SiteBuilder is a basic site builder that comes free with the Yahoo! web hosting plans.
Finally, Macromedia ColdFusion is designed for advanced web applications (namely, Flash forms).


See also 


• HTML editor


Adobe Flash 


Adobe Flash CS3 under Mac OS X.

Developer: Adobe Systems
Latest release: CS3 (9.0) / April 16, 2007
OS: Windows (no native Windows XP Professional x64 Edition support), Mac OS X, 32-bit Intel
compatible GNU/Linux and Solaris (Intel, SPARC) [1]
Genre: Multimedia Content Creator
License: Proprietary EULA
Website: Adobe.com's Flash page


Adobe Flash, or simply Flash, refers both to the Adobe Flash Player and to the Adobe Flash
Professional multimedia authoring program. Adobe Flash Professional is used to create content
for the Adobe Engagement Platform (such as web applications, games and movies, and content 
for mobile phones and other embedded devices). The Flash Player, developed and distributed by 
Adobe Systems (which acquired Macromedia in a merger that was finalized in December 2005), 
is a client application available in most common web browsers. It features support for vector and 
raster graphics, a scripting language called ActionScript and bi-directional streaming of audio 
and video. There are also versions of the Flash Player for mobile phones and other non-PC 
devices. 

Strictly speaking, Adobe Flash Professional is an integrated development environment (IDE) 
while Flash Player is a virtual machine used to run, or parse, the Flash files. But in contemporary 
colloquial terms "Flash" can refer to the authoring environment, the player, or the application 
files. 

Since its introduction in 1996, Flash technology has become a popular method for adding 
animation and interactivity to web pages; several software products, systems, and devices are 
able to create or display Flash. Flash is commonly used to create animation, advertisements, 
various web-page components, to integrate video into web pages, and more recently, to develop 
rich Internet applications. 

The Flash files, traditionally called "Flash movies" or "Flash games", have a .swf file 
extension and may be an object of a web page, strictly "played" in a standalone Flash Player, or 
incorporated into a Projector, a self-executing Flash movie with the .exe extension in Windows. 
Flash Video files have a .flv file extension and are utilized from within .swf files.


History 


Flash grew out of a chain of thought that started in the 1980s with some ideas Jonathan Gay 
had at school, then at college and later while working for Silicon Beach Software and its 
successors.[1] In January 1993, Charlie Jackson, Jonathan Gay, and Michelle Welsh started a 
small software company called FutureWave Software and created their first product, 
SmartSketch. A drawing application, SmartSketch was designed to make creating computer 
graphics as simple as drawing on paper. Although SmartSketch was an innovative drawing 
application, it didn't gain enough of a foothold in its market. As the Internet began to thrive, 
FutureWave began to realize the potential for a vector-based web animation tool that might 
easily challenge Macromedia's often slow-to-download Shockwave technology. In 1995, 
FutureWave modified SmartSketch by adding frame-by-frame animation features and re-released 
it as FutureSplash Animator on Macintosh and PC. By that time, the company had added a
second programmer, Robert Tatsumi; an artist, Adam Grofcsik; and a PR specialist, Ralph
Mittman. The product was offered to Adobe and used by Microsoft in its early (MSN) work with
the Internet. In December 1996, Macromedia acquired the vector-based animation software and 
later released it as Flash 1.0. 


• Macromedia Flash 2 (1997) Features: Support of stereo sound, enhanced
bitmap integration, buttons, the Library, and the capability to tween color
changes.

• Macromedia Flash 3 (1998) Features: Brought improvements to animation,
playback, and publishing, as well as the introduction of simple script commands
for interactivity. Macromedia shipped its 100,000th Flash product this year, as
well.

• Macromedia Flash 4 (1999) Features: Achieved 100 million installations of the
Flash Player, thanks in part to its inclusion with Microsoft Internet Explorer 5.
Flash 4 saw the introduction of streaming MP3s and the Motion Tween. Initially,
the Flash Player plug-in was not bundled with popular web browsers and users
had to visit the Macromedia website to download it, but as of 2000, the Flash
Player was already being distributed with all AOL, Netscape and Internet
Explorer browsers. Two years later it shipped with all releases of Windows XP.
The install base of the Flash Player reached 92% of all Internet users.

• Macromedia Flash 5 (2000) Features: Flash 5 was a major leap forward in
capability, with the evolution of Flash's scripting capabilities, released as
ActionScript. Flash 5 also introduced the ability to customize the authoring
environment's interface.

• Macromedia Generator was the first initiative from Macromedia to separate
design from content in Flash files. Generator 2.0 was released in April 2000 and
featured real-time server-side generation of Flash content in its Enterprise
Edition. Generator was discontinued in 2002 in favor of new technologies such
as Flash Remoting, which allows for seamless transmission of data between the
server and the client, and ColdFusion Server.

• In October 2000, usability guru Jakob Nielsen wrote a polemic article regarding
the usability of Flash content entitled "Flash: 99% Bad". (Macromedia later hired
Nielsen to help them improve Flash usability.)

• In September 2001, a survey made for Macromedia by Media Metrix showed
that out of the 10 biggest websites in the United States, 7 were making use of
Flash content.

• On March 15, 2002, Macromedia announced the availability of Macromedia
Flash MX and Macromedia Flash Player 6, with support for video, application
components, shared libraries, and accessibility.

• Flash Communication Server MX, also released in 2002, allowed video to be
streamed to Flash Player 6 (otherwise the video could be embedded into the
Flash movie).

• Flash MX 2004 was released in September 2003, with features such as runtime
performance up to 8 times faster with the enhanced compiler and the new
Macromedia Flash Player 7, the ability to create charts, graphs and additional text
effects with the new support for extensions (sold separately), high fidelity
import of PDF and Adobe Illustrator 10 files, mobile and device development,
and a forms-based development environment. ActionScript 2.0 was also
introduced, giving developers a formal object-oriented approach to
ActionScript. V2 Components replaced Flash MX's components, being rewritten
from the ground up to take advantage of ActionScript 2.0 and object-oriented
principles. Flash MX 2004 was the first release of Flash to be segmented into
"Basic" and "Professional" versions. The Basic version was targeted at
traditional Flash animators while the Professional version brought more
advanced capabilities that developers would use, for example the data
components.

• In 2004, the "Flash Platform" was introduced. This expanded Flash to more than
the Flash authoring tool. Flex 1.0 and Breeze 1.0 were released, both of which
utilized the Flash Player as a delivery method but relied on tools other than the
Flash authoring program to create Flash applications and presentations. Flash
Lite 1.1 was also released, enabling mobile phones to play Flash content.

• Macromedia Flash 8 (2005) was touted by Macromedia as the most significant
upgrade to Flash since Flash 5. New features included filter effects and blending
modes, bitmap caching, a new video codec called On2 VP6, an enhanced type
rendering engine called FlashType, an emulator for mobile devices, and several
enhancements to the ActionScript 2.0 spec, such as the BitmapData class,
several geometric classes, and the ConvolutionFilter and DisplacementMapFilter
classes.

• Flash Lite 2 was also released in 2005, which brought its capabilities in line
with Flash Player 7.

• On December 3, 2005, Adobe Systems acquired Macromedia and its product
portfolio (including Flash).

• Adobe Flash Player 9 was released for Windows and Mac OS in 2006, which
marked the first time a Flash Player major release occurred without a
simultaneous Flash authoring program major release. Flex 2.0 was released in
conjunction with Flash Player 9, and the player will be continued when Flash
Authoring 9 is released in 2007. For the first time in the history of Flash, the
Flash Player will have had an opportunity to become widely installed before the
release of the equivalent Flash program.

• Adobe Flash Player 9 was released for Linux in January 2007.

• Adobe Flash CS3 (2007), derived from Flash 8 with several updates for
integration with other Adobe products, was released as part of the
Adobe Creative Suite 3. This currently newest version also brings ActionScript
3.0 and a new XML engine to the Flash authoring tool. It also has an improved and
optimized GUI, like the rest of the CS3 suite.


History (Authoring tool) 


• FutureSplash Animator (10 April 1996) - initial version of Flash with
basic editing tools and a timeline

• Flash 1 (November 1996) - a Macromedia re-branded version of the
FutureSplash Animator

• Flash 2 (June 1997) - Released with Flash Player 2, new features included: the
object library

• Flash 3 (31 May 1998) - Released with Flash Player 3, new features included:
the movieclip element, JavaScript plug-in integration, transparency and an
external stand-alone player

• Flash 4 (15 June 1999) - Released with Flash Player 4, new features included:
internal variables, an input field, advanced ActionScript, and streaming MP3

• Flash 5 (24 August 2000) - Released with Flash Player 5, new features included:
ActionScript 1.0 (based on ECMAScript, making it very similar to JavaScript in
syntax), XML support, Smartclips (the precursor to components in Flash), HTML
text formatting added for dynamic text

• Flash MX (ver 6) (15 March 2002) - Released with Flash Player 6, new features
included: a video codec (Sorenson Spark), Unicode, v1 UI Components,
compression, ActionScript vector drawing API

• Flash MX 2004 (ver 7) (9 September 2003) - Released with Flash Player 7, new
features included: ActionScript 2.0 (which enabled an object-oriented
programming model for Flash), behaviors, extensibility layer (JSAPI), alias text
support, timeline effects

• Flash MX Professional 2004 (ver 7) (9 September 2003) - Released with Flash
Player 7, new features included all Flash MX 2004 features plus: Screens (forms
for non-linear state-based development and slides for organizing content in a
linear slide format like PowerPoint), web services integration, video import
wizard, Media Playback components (which encapsulate a complete MP3
and/or FLV player in a component that may be placed in a SWF), Data
components (DataSet, XMLConnector, WebServicesConnector,
XUpdateResolver, etc) and data binding APIs, the Project Panel, v2 UI
components, and Transition class libraries.

• Flash Basic 8 (released on 13 September 2005) - A less feature-rich version of
the Flash authoring tool targeted at new users who only want to do basic
drawing, animation and interactivity. Released with Flash Player 8, this version
of the product has limited support for video and advanced graphical and
animation effects.

• Flash Professional 8 (released on 13 September 2005) - Released with the
Flash Player 8, Flash Professional 8 added features focused on expressiveness,
quality, video, and mobile authoring. New features included Filters and blend
modes, easing control for animation, enhanced stroke properties (caps and
joins), object-based drawing mode, run-time bitmap caching, FlashType
advanced anti-aliasing for text, On2 VP6 advanced video codec, support for
alpha transparency in video, a stand-alone encoder and advanced video
importer, cue point support in FLV files, an advanced video playback
component, and an interactive mobile device emulator.

• Flash CS3 Professional (as version 9, released on 16 April 2007) - Flash CS3 is
the first version of Flash released under the Adobe name. CS3 features full
support for ActionScript 3.0, allows tweens to be converted into ActionScript,
adds better integration with other Adobe products such as Adobe Photoshop,
and also provides better vector drawing behavior, becoming more similar to
Adobe Illustrator and Adobe Fireworks.


Future developments 


Adobe Labs (previously Macromedia Labs) is a source for early looks at emerging products
and technologies from Adobe-Macromedia, including downloads of the latest software and
plugins. Flash 9, Flex 2, and ActionScript 3.0 are among the topics discussed there.

The important new development in Flash is its increasing use in providing the presentation 
layer in handheld devices. Adobe is aggressively courting cell phone and PDA vendors, and 
partnering to deploy Flash Lite as the user interface. 

A project currently in development at Adobe Labs is the Apollo Project which is a cross-OS 
runtime that allows developers to reuse their existing web development skills (Flash, Flex, 
HTML, Ajax) to build and deploy desktop Rich Internet Applications (RIAs). While features of 
Apollo are still being fully defined, the project aims to be made available in public beta form in 
early 2007, with final release planned for later that year. 

The next version of Flash will have two additional components designed for large scale 
implementation. Adobe is adding in the option to require an ad to be played in full before the 
main video piece is played. This would be most useful for large scale video sites. Also, Adobe 
has announced plans to add DRM into the new version of Flash. This way Adobe can give
companies the option to link an advertisement with content and make sure that both are played
and that they are not changed.


Programming language 


Main article: ActionScript 

Initially focused on animation, early versions of Flash content offered few interactivity 
features and thus had very limited scripting capability. 

More recent versions include ActionScript, an implementation of the ECMAScript standard 
which therefore has the same syntax as JavaScript, but in a different programming framework 
with a different associated set of class libraries. ActionScript is used to create almost all of the 
interactivity (buttons, text entry fields, pick lists) seen in many Flash applications. 

New versions of the Flash Player and authoring tool have strived to improve on scripting 
capabilities. Flash MX 2004 introduced ActionScript 2.0, a scripting programming language 
more suited to the development of Flash applications. It is often possible to save a lot of time by 
scripting something rather than animating it, which usually also retains a higher level of 
editability. 

Of late, the Flash libraries are being used with the XML capabilities of the browser to render 
rich content in the browser. Since Flash provides more comprehensive support for vector 
graphics than the browser and because it provides a scripting language geared towards
interactive animations, it is being considered a viable addition to the capabilities of a browser. 
This technology, which is currently in its nascent stage, is known as Asynchronous Flash and 
XML, much like AJAX, but with possibly greater potential. 


Content protection 


Many times, Flash authors will decide that while they desire the advantages that Flash 
affords them in the areas of animation and interactivity, they do not wish to expose their images 
and/or code to the world. However, once an .swf file is saved locally, it may then quite easily be 
decompiled into its source code and assets. Some decompilers are capable of nearly full 
reconstruction of the original source file, down to the actual code that was used during creation. 

In opposition to the decompilers, SWF obfuscators have been introduced to provide a 
modicum of security, some produced by decompiler authors themselves. The higher-quality 
obfuscators use traps for the decompilers, making some fail, but none have definitively been 
shown to protect all content. 


Competition


Format and plug-in


Compared to other plug-ins such as Java, Acrobat Reader, QuickTime or Windows Media 
Player, the Flash Player has a small install size, quick download time, and fast initialization time. 
However, care must be taken to detect and embed the Flash Player in (X)HTML in a W3C 
compliant way. A simple and widely used workaround is provided below: 


<object data="movie.swf" type="application/x-shockwave-flash" width="500" height="500"> 
<param name="movie" value="movie.swf"/> </object> 


More information on how to detect and embed Flash objects in a W3C compliant way is
provided in the xSWF description. 

The use of vector graphics combined with program code allows Flash files to be smaller, or 
streams to use less bandwidth, than the corresponding bitmaps or video clips. For content in a 
single format (such as just text, video or audio) other alternatives may provide better 
performance and consume less CPU power than the corresponding Flash movie, for example 
when using transparency or making large screen updates such as photographic or text fades. 

In addition to a vector-rendering engine, the Flash Player includes a virtual machine called 
the ActionScript Virtual Machine (AVM) for scripting interactivity at run-time, support for 
video, MP3-based audio, and bitmap graphics. As of Flash Player 8, it offers two video codecs: 
On2 Technologies VP6 and Sorenson Spark, and run-time support for JPEG, Progressive JPEG, 
PNG, and GIF. In the next version, Flash is slated to use a just-in-time compiler for the 
ActionScript engine. 
Flash as a format has become very widespread on the desktop market. According to an NPD
study, 98% of US Web users have the Flash Player installed, with 45%-56% (depending on
region) having the latest version. Numbers vary depending on the detection scheme and research 
demographics. 

Flash players exist for a wide variety of different systems and devices. Flash content can run 
consistently on Microsoft Windows, Mac OS, and Linux (Macromedia has created or licensed 
players for the following operating systems: Windows, Mac OS 9/X, Solaris, HP-UX, Pocket 
PC, OS/2, QNX, Symbian, Palm OS, BeOS, and IRIX). See also Macromedia Flash Lite for 
Flash compatibility on other devices. 

Adobe offers the specifications of the Flash file format (excluding specifications of related 
formats such as AMF) to developers who agree to a license agreement that permits them to use 
the specifications only to develop programs that can export to the Flash file format. The license 
forbids the use of the specifications to create programs that can be used for playback of Flash 
files. 


Free software alternatives 


There is, as of late 2006, no complete free software replacement which offers all the 
functionality of the latest version of Adobe Flash. Gnash, based on GameSWF, is a Flash player 
replacement that is under development and has the support of Free Software Foundation (FSF). 
Gnash supports Flash 7 and below, but not files that require version 8 or 9 features. Open Source 
projects aim to create a flash development environment. 

A full end-to-end implementation of the W3C SVG and SMIL specifications would offer
close competition for most of the features of Flash in an open, standard way. Adobe used to
develop and distribute the 'Adobe SVG Viewer' client plug-in for MS Internet Explorer, but has
recently announced its discontinuation. It has been noted by industry commentators[3] that this
is probably no coincidence at a time when Adobe has moved from competing with Macromedia's
Flash to owning the technology itself. Meanwhile, Opera has supported SVG since version 8,
and Firefox's built-in support for SVG continues to grow.


Third-party software 


Because Flash files do not depend on an open standard such as SVG, there is less incentive
for non-commercial software to support the format, although there are several third party tools
which use and generate the SWF file format. IrfanView is capable of playing SWF files. There is 
a large and vibrant open source community. Flash Player cannot ship as part of a pure open 
source, or completely free operating system, as its distribution is bound to the Macromedia 
Licensing Program and subject to approval. 

Authoring 

In October 1998, Macromedia disclosed the Flash Version 3 Specification to the world on its 
website. It did this in response to many new and often semi-open formats competing with SWF, 
such as Xara's Flare and Sharp's Extended Vector Animation formats. Several developers quickly 
created a C library for producing SWF. February 1999 saw the launch of MorphInk 99, the first 
third party program to create SWF files. Macromedia also hired Middlesoft to create a freely 
available developers' kit for the SWF file format versions 3 to 5. 

Today, several open and free libraries and tool sets exist to generate and manipulate SWF 
files on many platforms. These include the Ming library, SWFTools, and the combination of 
swfmill and MTASC. 

Macromedia has made the Flash Files specifications for versions 6 and later available only 
under a non-disclosure agreement, but it is widely available from various sites. 

Many shareware developers produced Flash creation tools and sold them for under $50 USD 
between 2000 and 2002. In 2003 competition and the emergence of free Flash creation tools, 
most notably OpenOffice.org Impress, had driven many third-party Flash-creation tool-makers 
out of the market, allowing the remaining developers to raise their prices, although many of the 
products still cost less than $100 USD and support ActionScript. As for open source tools, 
KToon can edit vectors and generate SWF, but its interface is very different from Macromedia's. 
Another, more recent example of a Flash creation tool is SWiSH Max, made by an ex-employee 
of Macromedia. Toon Boom Technologies also sells a traditional animation tool based on Flash, 
Toon Boom Studio. 

Adobe wrote a software package called Adobe LiveMotion, designed to create interactive 
animation content and export it to a variety of formats, including SWF. LiveMotion went 
through two major releases, but failed to gain any notable user base. Cartoon Man X Studios is 
one of the studios that uses this software. 

In February 2003, Macromedia purchased Presedia, which had developed a Flash authoring 
tool that automatically converted PowerPoint Files into Flash. Macromedia subsequently 
released the new product as Breeze, which included many new enhancements. Since that time, 
Macromedia has seen competing PowerPoint-to-Flash authoring tools from PointeCast (not to be 
confused with PointCast) and PresentationPro among others. In addition, (as of version 2) 
Apple's Keynote presentation software also allows users to create interactive presentations and 
export to SWF. 

In April of 2006, the Macromedia Flash SWF file format specification was released with 
details on the then newest version format (Flash 8). Although still lacking specific information 
on the incorporated video compression formats (On2, Sorenson Spark, etc.), this new 
documentation covers all the new features offered in Flash v8 including new ActionScript 
commands, expressive filter controls, and so on. The file format specification document is 
typically obtainable by subscribing to Macromedia's membership system and accepting its license 
restrictions (which include a prohibition against using these specifications to develop a free 
alternative). 


Microsoft Silverlight 

Formally released at the National Association of Broadcasters (NAB) 2007 show and formerly 
known under the code name Windows Presentation Foundation/Everywhere, or WPF/E, Silverlight 
appears to be Microsoft's entry into the digital delivery of interactive content. A benefit for 
Microsoft is that it can push the player down to the majority of its users through the 
automatic update feature of its current operating systems. 


Criticisms 


Usage 


Due to the increase in the use of Flash in aggressive, and even intrusive, online 
advertising, tools have emerged that restrict Flash content on some or all websites by temporarily 
or permanently turning Flash Player off, depending on user requirements. Examples of such tools 
are Flashblock and Adblock for the Mozilla Firefox, Internet Explorer, and Safari browsers, 
which are commonly used in conjunction with each other to allow users to control 
what Flash content they see. 

Many websites rely on Flash being available by default in a user's web browser and do not 
check whether it is installed. If Flash is not installed, users may be unable to access some 
Flash-dependent websites or site features. These sites sometimes also depend on a fast internet 
connection, especially highly complex websites or those with music. While it is possible to view 
Flash-based sites over a slower connection, such as dial-up or slow DSL, the experience may be 
frustrating for the user. Blocking tools generally do alert the end user to the fact that Flash 
content is present on the site, allowing the user to view it if they wish. 

Flash stores its content in a binary file that is not easily accessible by search engine crawlers. 
However, this problem can be alleviated with coding techniques to detect if a client is capable of 
viewing the flash content and showing standard markup code if it is not. 
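
One such technique needs no script-based detection at all: in standard HTML, a browser that cannot render an object element displays the element's nested content instead, and crawlers can read that nested markup. The sketch below generates such a page fragment. The function name and its parameters are illustrative assumptions, and this is only one of several possible approaches.

```python
from html import escape


def flash_with_fallback(swf_url: str, width: int, height: int, fallback_html: str) -> str:
    """Build an <object> embed whose nested content serves as the non-Flash fallback.

    fallback_html is trusted markup supplied by the page author; it is shown by
    browsers that cannot render the movie and is visible to search engine crawlers.
    """
    url = escape(swf_url, quote=True)
    return (
        f'<object type="application/x-shockwave-flash" data="{url}" '
        f'width="{width}" height="{height}">\n'
        f'  <param name="movie" value="{url}">\n'
        f"  {fallback_html}\n"
        "</object>"
    )


if __name__ == "__main__":
    print(flash_with_fallback("intro.swf", 400, 300, "<p>Welcome to our site.</p>"))
```

Keeping the fallback as ordinary markup means the same text is available to users without the plugin and to search engines alike.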

As with most new technologies that are easy to learn, Flash developers can have trouble 
producing content that is innovative yet intuitive enough for novice users. Over the years, more 
Flash developers have learned to keep a customer focus while creating content that takes 
advantage of the latest features. 


Using Flash to restrict access to content 


Many content producers use Flash as a way to limit users' access to the media displayed in 
their browsers, and/or to gain clicks by forcing extra steps before content is displayed. For 
example, in Windows, Shockwave/Flash (.swf) files cannot be right-clicked and saved. Famously, 
YouTube furnishes all video in Flash Video format (.flv), requiring users to turn to third-party 
solutions to store the content locally. The practice is now spreading to photo sharing websites 
such as Webshots, where a Flash overlay sits over the initial photo displayed, requiring a second 
click to retrieve the photo and slowing the experience considerably. However, if Flash is not 
installed, the image displays normally. 


Local Shared Objects 


Flash Player versions 6 and later can store and retrieve persistent data without offering any 
visible signs to the user, in a manner similar to that of cookies. It is possible to clear the 
temporary files that Flash stores on the user's computer either through the Flash website or by 
deleting the files manually. The default storage location for Local Shared Objects (LSOs) is 
operating-system dependent. For Windows XP, the location is within each user's Application Data 
directory, under Macromedia\Flash Player\#SharedObjects. For Mac OS X, the location is in each 
user's Library directory, under Preferences/Macromedia/Flash Player/#SharedObjects. On Linux, the 
location is in each user's home directory: ~/.macromedia/Flash_Player/#SharedObjects. 
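
The per-platform locations listed above can be collected into a small helper, for example as the first step of a script that lists or clears stored objects. This is a sketch under stated assumptions: the function name is hypothetical, the system strings follow the values returned by Python's platform.system(), and the caller supplies the home and Application Data directories rather than the code guessing them.

```python
from pathlib import PurePosixPath, PureWindowsPath


def lso_directory(system: str, home: str, appdata: str = ""):
    """Return the default Local Shared Object directory described in the text.

    system  -- "Windows", "Darwin" (Mac OS X) or "Linux"
    home    -- the user's home directory (used on Mac OS X and Linux)
    appdata -- the user's Application Data directory (used on Windows)
    """
    if system == "Windows":
        return PureWindowsPath(appdata) / "Macromedia" / "Flash Player" / "#SharedObjects"
    if system == "Darwin":
        return (PurePosixPath(home) / "Library" / "Preferences"
                / "Macromedia" / "Flash Player" / "#SharedObjects")
    if system == "Linux":
        return PurePosixPath(home) / ".macromedia" / "Flash_Player" / "#SharedObjects"
    raise ValueError(f"no known LSO location for {system!r}")


if __name__ == "__main__":
    print(lso_directory("Linux", "/home/ann"))
```

Pure path classes are used so that the Windows and POSIX locations can both be computed on any machine, without touching the file system.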
Application flaws 


Specially crafted files have been shown to cause Flash applications to malfunction, 
allowing the execution of malicious code. The Flash Player has a long history of security flaws 
that expose computers to remote attacks. However, exploitation of these flaws has remained at 
the proof-of-concept stage and has not escalated into a real-world problem. 

In addition to entries in the Open Source Vulnerability Database, security advisories 
published in August 2002, December 2002, and November 2005 highlight three examples of 
reports about various Flash Player versions that allowed remote code execution. 


Accessibility issues 


The US Justice Department has stated in regard to the Americans with Disabilities Act: 

"Covered entities under the ADA are required to provide effective communication, 
regardless of whether they generally communicate through print media, audio media, or 
computerized media such as the Internet. Covered entities that use the Internet for 
communications regarding their programs, goods, or services must be prepared to offer those 
communications through accessible means as well." 

Currently, businesses are able to sidestep this mandate to a great degree because of the 
increased cost and complexity associated with providing content to sight-challenged viewers. 
However, the same argument cannot be made about those with ADHD, since the advertisers are 
actively interfering with an already accessible website. 

Internet users who are visually impaired, and who may rely on a screen reader, a braille 
display, larger text sizes, or high-contrast color schemes, may find sites that make 
extensive use of Flash difficult or impossible to use. While later versions (Flash Player 6 and 
onwards) support accessibility functions, site designers may not necessarily design the Flash 
content with these considerations in mind. 


Flash Player on various platforms 


The Adobe Flash Player is mainly optimized for the 32-bit Windows platform. 32-bit editions 
of version 9 are also available for Mac OS X and Linux. On other platforms, such as Solaris, 
there are currently no releases later than version 7. Adobe has been criticized for neglecting to 
optimize its products on non-Microsoft platforms. This has led to poor web surfing performance 
on Macintosh and Linux computers, since many websites use Flash animations for menus and 
advertisements.[12][13] Flash Player 7 for Linux was very CPU-hungry in fullscreen mode, 
resulting in low frame rates. 

Adobe has rewritten the bitmap drawing routines in Flash Player 8 for Mac, using OpenGL 
planes via Quartz to draw the surfaces. The new drawing code is reported to be actually faster 
than its Windows counterpart, where JPEG, TIFF or other bitmap images are composited into the 
animation. 

The Linux version of the Flash Player requires the Advanced Linux Sound Architecture 
(ALSA) to output sound. ALSA was introduced in Linux 2.5 and can only be used with recent 
sound hardware. Users of the Open Sound System must either compile and install the abstraction 
layer flashsupport provided by Adobe, run the Windows Flash Player in a Windows browser 
through WINE, or switch to ALSA, which involves upgrading the kernel and typically the sound 
card. However, it is questionable how many Linux users would be able to identify this 
problem. On Flash Player 7 for Linux, the sound could lag about a second behind the picture; 
this issue was resolved in Flash Player 9. Flash Player 8 was never released for Linux; Adobe 
stated that it would skip that version and instead focus on preparing Flash Player 9. This 
decision led to disappointment in the Linux community, with some people feeling that Adobe 
had abandoned the Linux market. Flash Player 9 for Linux was released in January 2007, 
providing platform parity once again. On Linux, it is generally not possible to scroll a web page 
while the mouse cursor is held over a Flash animation (on some distributions, it is possible by 
using the arrow keys after a mouse click on the page outside the Flash content). 

Adobe has yet (as of April 2007) to release a Flash Player for the x86-64 architecture on any 
operating system.[14] There is to date no Linux Flash Player for processors other than 32-bit x86 
(e.g. native x86-64, PowerPC, ARM). Adobe employees have said the Flash implementation 
is very 32-bit specific and that porting to 64-bit systems would require a lot of effort. Adobe is 
currently working on a 64-bit version. Adobe has not yet released any of its development 
software for any UNIX-like operating system except Mac OS X. 

Although it is commonly held that Flash content cannot be indexed properly by search engine 
"bots", the problem has been corrected, or at least alleviated, with Flash CS3. Authors can 
publish Flash CS3 movies with metadata describing individual Flash movie content, such as links, 
which search engine bots can use to categorize the information properly. In addition, with the 
use of XML and external scripting languages (like PHP), Flash can now run 
websites at greater speeds than their graphic-heavy counterparts (some applications can be as 
small as 50K, about the size of one image). 


Digital Rights Management 

According to a recent article from BBC News, the latest iteration of Flash allows copyright 
holders to embed ads within videos, as well as control how those videos are used. With this 
latest piece of software, companies will be able to quickly remove any video that they feel 
violates copyright and force advertisements to play prior to the start of the video. 


Market share 


According to a Millward Brown survey conducted in March 2007, Adobe claims Flash reaches 
98.7% of desktop Internet users. Independent market share data is not available because the 
several companies that periodically gather browser usage data do not measure Flash penetration. 


Related file formats and extensions 


Ext.   Explanation 

.swf   .swf files are completed, compiled and published files that cannot be edited with Adobe 
       Flash. However, many '.swf decompilers' do exist. Attempting to import .swf files using 
       Flash allows it to retrieve some assets from the .swf, but not all. 
.fla   .fla files contain source material for the Flash application. Flash authoring software can 
       edit FLA files and compile them into .swf files. 
.as    .as files contain ActionScript source code in simple source files. FLA files can also 
       contain ActionScript code directly, but separate external .as files often emerge for 
       structural reasons, or to expose the code to versioning applications. They sometimes use 
       the extension .actionscript. 
.swd   .swd files are temporary debugging files used during Flash development. Once finished 
       developing a Flash project these files are not needed and can be removed. 
.asc   .asc files contain Server-Side ActionScript, which is used to develop efficient and 
       flexible client-server Macromedia Flash Communication Server MX applications. 
.flv   .flv files are Flash video files, as created by Adobe Flash, ffmpeg, Sorenson Squeeze, or 
       On2 Flix. 
.swc   .swc files are used for distributing components; they contain a compiled clip, the 
       component's ActionScript class file, and other files that describe the component. 
.jsfl  .jsfl files are used to add functionality in the Flash Authoring environment; they contain 
       JavaScript code and access the Flash JavaScript API. 
.swt   .swt files are 'templatized' forms of .swf files, used by Macromedia Generator. 
.flp   .flp files are XML files used to reference all the document files contained in a Flash 
       Project. Flash Projects allow the user to group multiple, related files together to assist 
       in Flash project organization, compilation and build. 
.spl   .spl files are FutureSplash documents. 
.aso   .aso files are cache files used during Flash development, containing compiled ActionScript 
       byte code. An ASO file is recreated when a change in its corresponding class files is 
       detected. Occasionally the Flash IDE does not recognize that a recompile is necessary, and 
       these cache files must be deleted manually. On Win32 / Flash8 they are located in 
       %USERPROFILE%\Local Settings\Application Data\Macromedia\Flash8\en\Configuration\Classes\aso. 
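
The extension list above lends itself to a simple lookup, for instance in a script that tidies a project directory. The sketch below is illustrative only: the names are hypothetical, the descriptions are shortened from the table, and treating .swd and .aso as safe to delete follows the table's remarks that those files are temporary or cached.

```python
from pathlib import PurePath

# Short descriptions condensed from the extension table.
FLASH_EXTENSIONS = {
    ".swf": "compiled, published movie",
    ".fla": "authoring source document",
    ".as": "ActionScript source",
    ".swd": "temporary debugging file",
    ".asc": "server-side ActionScript",
    ".flv": "Flash video",
    ".swc": "distributable component",
    ".jsfl": "authoring-environment script",
    ".swt": "Generator template",
    ".flp": "Flash Project file",
    ".spl": "FutureSplash document",
    ".aso": "compiled ActionScript cache",
}

# By-products of development that the table describes as removable.
DISPOSABLE = {".swd", ".aso"}


def describe(filename: str) -> str:
    """Return a short description for a file name, or "unknown"."""
    return FLASH_EXTENSIONS.get(PurePath(filename).suffix.lower(), "unknown")


def is_disposable(filename: str) -> bool:
    """Return True for temporary or cache files that can be regenerated."""
    return PurePath(filename).suffix.lower() in DISPOSABLE
```

Matching is done on the lower-cased suffix, so "Intro.SWF" and "intro.swf" are treated alike.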


Video in web pages 


Flash is increasingly used as a way to display video clips on web pages, a feature available 
since Flash Player version 7. As a video format, Flash is valued for its ability to be displayed 
inside browser windows rather than for its platform compatibility, which is relatively limited 
for a video format. While other video file types can be played on more platforms, typically 'out 
of the box' on the majority of operating systems, their browser plugins are limited in 
distribution. In contrast, the Adobe Flash Player is a dedicated browser plugin (also available 
as a standalone player) and offers very good platform compatibility compared to other browser 
plugins. It is available for many popular platforms, including Windows, Mac OS X and, to some 
extent, Linux. Flash is used as the basis for many popular video sites, including YouTube and 
Google Video. 

Flash Video (.flv files) is a container format, meaning that it is not a video format in itself, 
but can contain other formats. The video in Flash is encoded in H.263, and starting with Flash 
Player 8, it may alternatively be encoded in VP6. The audio is in MP3. The use of VP6 is 
common in many companies, because of the large adoption rates of Flash Player 8 and Flash 
Player 9. 
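
The container nature of the format shows in its header, which only announces what kinds of streams follow. A minimal sketch, assuming the published FLV header layout (the signature "FLV", a version byte, a flags byte in which bit 0x04 marks audio and bit 0x01 marks video, and a big-endian 32-bit header size); the function name is illustrative.

```python
import struct


def parse_flv_header(data: bytes) -> dict:
    """Decode the nine-byte header at the start of an FLV file."""
    if len(data) < 9 or data[:3] != b"FLV":
        raise ValueError("not an FLV header")
    version = data[3]
    flags = data[4]
    # Header size (offset to the first tag) is a big-endian unsigned 32-bit integer.
    (header_size,) = struct.unpack(">I", data[5:9])
    return {
        "version": version,
        "has_audio": bool(flags & 0x04),
        "has_video": bool(flags & 0x01),
        "header_size": header_size,
    }
```

The header does not say which codec is used; that information sits in the individual audio and video tags that follow, which is where H.263 and VP6 data are distinguished.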


See also 


• ActionScript 
• Flash animation 
• Adobe Flex 




Adobe Flex 


Adobe Flex is a software development kit and an IDE for a group of technologies initially 
released in March of 2004 by Macromedia to support the development and deployment of cross 
platform, rich Internet applications based on their proprietary Macromedia Flash platform. 

In April 2007, Adobe announced plans to open source Flex, specifically the Flex SDK, 
although the Flash underpinnings and the Flex Builder IDE still remain proprietary and 
commercial. 


Overview 


Traditional application programmers found it challenging to adapt to the animation metaphor 
upon which the Flash Platform was originally built. Flex seeks to minimize this problem by 
providing a workflow and programming model that is familiar to application developers. 

Flex was initially released as a J2EE application or JSP tag library that compiles MXML and 
ActionScript on-the-fly into Flash applications (binary SWF files). Later versions of Flex support 
the creation of static files that are compiled at authoring time and can be deployed online without 
the need for a server license. 

The goal of Flex is to allow Web application developers to quickly and easily build rich 
Internet applications. In a multi-tiered model, Flex applications serve as the presentation tier. 

Flex features development of graphic user interfaces using an XML-based language called 
MXML. Flex comes with various components and features that simplify capabilities such as web 
services, remote objects, drag and drop, sortable columns, charting/graphing, built-in animation 
effects, and other interface interactions. Since the client only loads once, application 
workflow is significantly improved versus HTML-based applications (e.g. PHP, ASP, JSP, 
CFMX), which require executing templates on the server with every action. Flex's language and 
file structure seek to decouple application logic from design. 

The Flex server also acts as a gateway to allow the client to communicate with XML Web 
Services and Remote Objects (such as Coldfusion CFCs, Java Classes, and anything else that 
supports the Action Message Format). 

Commonly mentioned as alternatives to Flex are OpenLaszlo, Ajax and Windows 
Presentation Foundation technologies. 


Initial Releases (Flex Server 1.0 and 1.5) 


Initial releases of Flex were targeted towards the enterprise application development market, 
and were priced around US$15000 per CPU. Each license included 5 Flex Builder licenses. Compared 
with working in ActionScript alone, Flex is better suited to pure developers. 


Adobe Flex 2 


Flex 2 changes the licensing model to make room for a free version of the technology, called 
"Flex 2 SDK". 

The new Flex Builder 2 is based on the Eclipse IDE. Enterprise-oriented services are 
available through Flex Data Services 2 providing data synchronization, data push, publish- 
subscribe and automated testing. 

Flex 2 introduces a new version of the ActionScript scripting language, ActionScript 3, 
which reflects the latest ECMAScript specification, and requires Flash Player 9 or later 
as its runtime. 

Flex was the first Macromedia product to be rebranded as Adobe. 


Adobe Flex 3 (beta) 


Adobe released the first beta of Flex 3 in June 2007. Major enhancements include integration 
with the new versions of its Creative Suite products, support for the new Adobe application 
runtime, AIR, and the addition of profiling and refactoring tools to the Flex Builder IDE. A more 
complete description and free downloads are available on the Adobe Labs site. 


LiveCycle Data Services 


LiveCycle Data Services (previously called Flex Data Services) is a server-side complement 
to the main Flex SDK and Flex Builder IDE and is part of a family of server-based products 
available from Adobe. Deployed as a Java EE application, LiveCycle Data Services adds the 
following capabilities to Flex applications: 

• Remoting, which allows Flex client applications to invoke methods on Java 
  server objects directly. Similar to Java remote method invocation (RMI), 
  remoting handles data marshalling automatically and uses a binary data 
  transfer format. 

• Messaging, which implements the "publish/subscribe" design pattern. The Flash 
  client can publish events to a topic defined on the server, or subscribe to 
  events broadcast from the message service. One of the common use cases for 
  this is real-time streaming of data, such as financial data or system status 
  information. 

• Data management services, which provide a programming model for 
  automatically managing data sets that have been downloaded to the Flex client. 
  Once data is loaded from the server, changes are automatically tracked and can 
  be synchronized with the server at the request of the application. Clients are 
  also notified if changes to the data set are made on the server. 

• PDF document generation, providing APIs for generating PDF documents by 
  merging client data or graphics with templates stored on the server. 
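
The publish/subscribe pattern named in the messaging item can be illustrated in a few lines. This is a generic, in-process sketch of the pattern only; the class and method names are hypothetical and are not the LiveCycle Data Services API, which works across a network between Flash clients and a Java server.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class MessageService:
    """A toy message service: topics map to lists of subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        """Register a callback to be invoked for every message on a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> int:
        """Deliver a message to all subscribers of a topic; return how many received it."""
        callbacks = list(self._subscribers[topic])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    service = MessageService()
    service.subscribe("quotes", lambda msg: print("received", msg))
    service.publish("quotes", {"symbol": "EXAMPLE", "price": 40.0})
```

Publishers and subscribers only share a topic name, which is what makes the pattern suitable for streaming data such as prices or status updates to many clients at once.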

Adobe provides a free version of LiveCycle Data Services, which is limited to a single CPU 
and doesn't support clustering. The full version costs $6,000 USD per CPU (limited to 100 
concurrent users) or $20,000 USD per CPU (no limit). [1] 


Flex and ColdFusion 


Flex 2 offers special integration with ColdFusion MX 7. The ColdFusion MX 7.0.2 release 
adds updated Flash Remoting to support ActionScript 3, a Flex Data Services event gateway, and 
the Flex Data Services assembler. Flex Builder 2 also adds extensions for ColdFusion providing 
a set of wizards for RAD Flex development. A subset of Flex 1.5 is also embedded into 
the ColdFusion MX 7 middleware platform, for use in the ColdFusion Flash forms feature. It is 
possible to use this framework to write rich internet applications, although its intended purpose is 
for rich forms only. 


Flex Application Development Process 


Everything below is directly sourced from the help file in version 2.0 Beta 3: 

• Define an application interface using a set of pre-defined components (forms, 
  buttons, and so on) 
• Arrange components into a user interface design 
• Use styles and themes to define the visual design 
• Add dynamic behavior (one part of the application interacting with another, for 
  example) 
• Define and connect to data services as needed 
• Build the source code into a SWF file that runs in the Flash Player 


Release history 


• Flex 1.0 - March 2004 
• Flex 1.5 - October 2004 
• Flex 2.0 (Alpha) - October 2005 
• Flex 2.0 Beta 1 - February 2006 
• Flex 2.0 Beta 2 - March 2006 
• Flex 2.0 Beta 3 - May 2006 
• Flex 2.0 FINAL - June 28, 2006 
• Flex 2.0.1 - January 5, 2007 
• Flex 2.0.1 available as open source - April 26, 2007 
• Flex 3.0 Beta 1 (Moxie) - June 11, 2007 


Notable sites using Flex 


Notable websites using Flex include: 
• Pikeo online photo sharing 
• Amsterdam-Schiphol Airport 
• Sherwin-Williams Color Visualizer 
• Sony Ericsson Product Catalogue 
• Yahoo! Maps 
• Yahoo! Messenger for the Web 
• FineTracker - Professional website statistics 

There is a showcase of Flex applications at the Flex Showcase. 


Comparison between AJAX and Flex 


Flash cartoons 


A Flash animation or Flash cartoon is an animated film which is created using Adobe 
Flash animation software and often distributed in the .swf file format. It can be created in Flash 
or with other programs capable of writing .swf files. The term Flash animation not only refers to 
the file format but to a certain kind of movement and visual style which, in many circles, is seen 
as simplistic or unpolished. However, with dozens of Flash animated television series, countless 
more Flash animated television commercials, and award-winning online shorts in circulation, 
Flash animation is enjoying a renaissance. 

In the late 1990s, when bandwidth for most Internet users was still limited to 56 kbit/s, many 
Flash animation artists employed limited animation or cutout animation when creating projects 
intended for web distribution. This allowed artists to release shorts and interactive experiences 
well under 1 MB, which could stream both audio and high-end animation. One example is the first 
episode of The Goddamn George Liquor Program, released in 1999 and rendered at only 628 KB. 
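
A rough calculation shows why these file sizes mattered. The sketch below assumes an ideal 56 kbit/s link with no protocol overhead and takes 1 KB as 1000 bytes; both are simplifying assumptions rather than figures from the text, and real dial-up throughput was typically lower.

```python
def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Return the ideal transfer time for a file over a link of the given speed."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # A 628 KB episode over a 56 kbit/s modem: roughly a minute and a half.
    print(round(download_seconds(628_000, 56_000), 1))
    # A 1 MB short over the same link.
    print(round(download_seconds(1_000_000, 56_000), 1))
```

Under these assumptions the 628 KB episode needs about 90 seconds to transfer in full, which is why streaming playback and small vector-based files were important for web distribution.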

Some hallmarks of poorly-produced Flash animation are jerky natural movements (seen in 
walk-cycles and gestures), auto-tweened character movements, lip-sync without interpolation, 
and abrupt changes from front to profile view. Although Flash is able to integrate bitmaps and 
other raster-based art, as well as video, most Flash films are created using only vector-based 
drawings which often result in a somewhat clean graphic appearance. 

Flash animations are typically distributed by way of the World Wide Web, in which case 
they are often referred to as Internet cartoons, online cartoons, or webtoons. Web Flash 
animations may be interactive and are often created in a series. A Flash animation is 
distinguished from a Webcomic, which is a comic strip distributed via the Web, rather than an 
animated cartoon. 

Today, Flash animations are increasingly used in online communities, with Flash generators 
and embedded videos appearing in user profiles across the Internet. Popular remakes appear 
again and again as the format finds success in less explored subject areas. YouTube-style sites 
are appearing across the Internet, alongside Google Video and MySpace video, and Flash 
generators are too numerous to count. Many Flash generators are now used by webmasters as 
menu creators, but dynamic lip-syncing Flash characters and font-to-image manipulators remain 
the most common. 

Simple animation in Flash MX: a square moving across the screen in a motion tween, one of 
the basic functions of Flash. 
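
A motion tween of this kind fills in the frames between two keyframes. The following sketch shows the idea with plain linear interpolation of a position; it is a generic illustration with a hypothetical function name, not Flash's own implementation, which also supports easing and other properties.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def tween(start: Point, end: Point, frames: int) -> List[Point]:
    """Return one position per frame, moving linearly from start to end.

    The first frame is the start keyframe and the last frame is the end keyframe.
    """
    if frames < 2:
        raise ValueError("a tween needs at least two frames")
    (x0, y0), (x1, y1) = start, end
    steps = frames - 1
    return [
        (x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
        for i in range(frames)
    ]


if __name__ == "__main__":
    # A square moving 200 pixels to the right over five frames.
    for position in tween((0, 100), (200, 100), 5):
        print(position)
```

Only the two keyframes have to be stored; every in-between position is computed, which is one reason tweened vector animation produces small files.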


History 


The first prominent use of the Flash animation format was by Ren & Stimpy creator John 
Kricfalusi. He embarked on a mission to bring cartoons to the Internet. Kricfalusi employed 
George Liquor (a fictional character rumored to have ended Kricfalusi's employment on Ren & 
Stimpy) and his dim-witted nephew Jimmy the Hapless Idiot Boy on their own Internet program 
titled The Goddamn George Liquor Program. Later, Kricfalusi produced more animated projects with 
Flash including several online shorts for Icebox.com, television commercials, and a music video. 
Soon after that, web cartoons began appearing everywhere. 

The Von Ghouls went live in November 1999, featuring the first music group with cartoon 
episodes online including original songs, in the vein of Saturday morning cartoons of the 1970s. 
A number of popular portal sites featured Flash animation during the dot-com boom of the late 
1990s, including Icebox, MondoMedia, CampChaos, MediaTrip, and AtomFilms. Stan Lee of 
Marvel Comics launched an animated comics site. 

The Internet also saw the proliferation of many adult-only Flash cartoon sites. Some of the 
shows from that period made the transition to traditional media, including Queer Duck, Gary the 
Rat, Happy Tree Friends, the politically-minded JibJab shorts and the popular Homestar Runner. 
Occasionally, the trend has been reversed: after The Critic was canceled by both ABC and Fox, 
AtomFilms created net-only episodes of the show in 2000-2001. In another instance, Flash almost 
made the transition to the big screen. In 2001, production began on what would have been the 
first Flash-animated feature film, the ill-fated "Lil' Pimp," which also began life as an Internet 
series. As potentially controversial as its subject matter was, it had a relatively large budget, a 
number of well-known actors (including William Shatner, Bernie Mac, and Lil Kim), a full crew, 
and a running time of nearly 80 minutes. Although Sony Pictures decided not to release the film, 
it was eventually released on DVD by Lion's Gate. 

Several recording companies experimented with releasing animated music videos to promote 
their artists' releases online, including Madonna, the Beastie Boys and Tenacious D; however, 
none became the hit that would have allowed Flash-animated music videos to expand. Adam 
Sandler and Tim Burton, among others, released original Internet-only animated works but were 
not able to devise successful financial models, and the trend dissipated, largely as a result of a 
lack of viable micropayment systems. 

Several popular online series are currently produced in Flash, such as the Emmy Award- 
winning Off-Mikes, produced by ESPN and Animax Entertainment and Gotham Girls, produced 
by Warner Brothers. 

Many animated television series are produced using Macromedia Flash, inspired by both the 
comparatively low cost of production and the unique style that can be achieved with the 
software, including Metalocalypse, Being Ian, Foster's Home For Imaginary Friends, Hi Hi 
Puffy AmiYumi, Happy Tree Friends, Odd Job Jack, Wow! Wow! Wubbzy!, the BBC Three show 
Monkey Dust, Yin Yang Yo, Aaagh! It's the Mr. Hell Show and Queer Duck from Showtime, and 
Shorties Watching Shorties on Comedy Central. 

Other television series, such as Home Movies and Harvey Birdman, Attorney at Law, both 
broadcast on Cartoon Network's Adult Swim programming block, have switched to Flash from 
other animation technology. 

Many animation film festivals have responded to the popularity of Flash animation by adding 
separate categories in competition for "web cartoons" or "Internet cartoons." Additionally, 
several exclusively web-based Flash competitions have been established. It is speculated that 
only the category "made for Internet" will survive, as competitions at animation film festivals are 
typically arranged in categories defined by film length and distribution channel, rather than by 
animation techniques or tools used to create the films. 


Timeline - First Flash Projects on Television 


Year  Project                         Notes 

2000  The Rosie O'Donnell Show -      For the 2000 and 2001 seasons, the show opens were created by 
      Opening Titles                  [...] Art, which has spun off into a company called Magic 
                                      Butter. These titles were nominated for a daytime Emmy for 
                                      'Main Title Design'. 
2000  Breakup Girl                    This aired on the Oxygen Network in March 2000 as part of the 
                                      "X-Chromosome" block, a sort of female Liquid Television 
                                      overseen by Kit Laybourne and Machi Tantillo. 
200?  Ollie's Under The Bed           This was a half-hour special on Teletoon in March, which 
      Adventures                      eventually spawned the series now known as Olliver's 
                                      Adventures. 
2001  [...]                           In November of 2001, this series launched on Australia's 
                                      Special Broadcasting Service network. 
2001  TV Warehouse                    The title sequence for this BBC program was produced in Flash 
                                      by Elephant Egg. 
2002  Meena                           One hundred minutes of a series of Public Service 
                                      Announcements (PSAs) for the project Meena were animated by 
                                      Future Thought Productions for UNICEF Asia. 
2002  The Proud Family                Episodes of The Proud Family were animated at Hyperion 
                                      Pictures under the Animobile label. Two episodes were 
                                      animated by Animax Entertainment (The Beach and The Picnic). 
2002  Home Movies                     The second season of this series was produced in Flash. 
2002  ¡Mucha Lucha!                   The first US network Flash series. 
2002  The Mr. Dink Show               Canada's first broadcast Flash production. 
2002  Bobinogs                        On BBC Two Wales. 
200?  Xiao Xiao #3                    A stick-figure animation shown on MTV. 
200?  Stroker and Hoop                Canceled Adult Swim series. 


Timeline - Other Flash Animated TV Series 


Year | Project | Notes
2003 | Chilly Beach | Launched on CBC Television, this became one of the first Flash productions to make the move from online "webisodes" to national TV.
2006 | Yin Yang Yo! | The second Disney series made entirely in Flash.
2006 | Pucca | A Flash series based on a series of online shorts produced by VOOZ in South Korea. The TV series is produced by Studio B in Canada.
2006 | Chaotic | A Flash series based on the story of the original Chaotic Trading Card Game.


Timeline - First Flash Feature Film Projects (Theatrically Released) 


Year | Project | Notes
2005 | The Golden Blaze | Directed by Bryon E. Carson and starring the voices of Blair Underwood and Michael Clarke Duncan, it had a limited theatrical run, making it the first Flash-animated film to be released on the big screen.
2006 | Romeo & Juliet: Sealed with a Kiss | Former Disney animator Phil Nibbelink took 4 1/2 years to make it, and he drew 112,000 frames with a Wacom tablet directly into Flash 4, in combination with Moho.
2006 | That Darn Jesus | An animated segment of nineteen minutes produced at an aspect ratio of 1.85:1 and HD 1080 (1920 px x 1080 px) for the movie Universal Remote by Future Thought Productions.


Flash Animation Distribution 


While the creation of animation using Flash can be easier and less expensive than traditional 
animation techniques, the amount of time, money, and skill required to produce a project using 
the software depends on the chosen content and style. Internet distribution is considerably easier 
and less expensive than television broadcasting, and websites such as Newgrounds and 
UGOplayer provide free hosting. Many Flash animations are created by individual or amateur 
artists, although it does require some amount of technical knowledge to create a notable work 
with the software. Many Flash animations first distributed on the web became popular enough to 
be broadcast on television, particularly on such networks as MTV and G4TV. 


Flash Animation in Professional Studios 


Flash animation production is enjoying considerable popularity in major animation studios 
around the world, as animators take advantage of the software's ability to organize a large 
number of assets (such as characters, scenes, movements, and props) for later re-use. Because 
Flash files are in vector file format, they can be used to transfer animation to 35 mm film without 
any compromise in image quality. This feature is used by several independent animators world- 
wide, including Phil Nibbelink, who saw his 77-minute feature film Romeo & Juliet: Sealed with 
a Kiss released in theaters in 2006. 
Creating Flash animation from other software 


There are a number of other software packages available that can create output in the .swf 
format. Among these are Animo (Cambridge Animation Systems), Toon Boom Studio, Toufee, 
Celaction and Anime Studio (previously named Moho from Lost Marble, now eFrontier). These 
front-ends often provide additional support for creating cartoons, especially with tools more 
tailored to traditionally-trained animators, as well as additional rigging for characters, which can 
speed up character animation considerably. Additionally, there are programs available which 
translate 3D information into 2D vectors. There are many Flash designers around the world, 
several of them notable. Neostream, for example, appointed eight designers worldwide to create 
its site, led by ABM Kamran, a 17-year-old from Bangladesh. 

OSFlash hosts a number of different open source methods of dealing with Flash animation. 


Action Message Format 


AMF is a binary format based loosely on the Simple Object Access Protocol (SOAP). It is 
used primarily to exchange data between a Flash application and a database, using a Remote 
Procedure Call. 

Each AMF message contains a body which holds the error or response, expressed as an 
ActionScript object. 


Data Types 


AMF includes several core data types which are always present throughout the serialization 
of data: 

• Byte - A simple 8-bit chunk of data. This is the simplest way to send AMF data. 

• Int - A 16-bit number made of two consecutive bytes. The first byte is the most 
significant byte, and the second byte is the least significant byte. 

• MediumInt - A 24-bit number, identical to the above except for the length. 

• Long - A 32-bit number, same as above, but longer. 

• Double - A 64-bit number made up of 8 consecutive bytes. It represents a 
floating point, signed number. Like the integer types, the double is encoded in 
network (big-endian) byte order. 

• UTF8 - A UTF-8 string shorter than 2^16 bytes. It is made of an integer (two 
bytes) representing the string length, followed by the UTF-8 encoded string. 

• LongUTF8 - A UTF-8 string possibly longer than 2^16 bytes. It consists of a long 
integer (four bytes) representing the string length, followed by the UTF-8 
encoded string. 
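
The byte layouts listed above can be sketched in a few lines. The following is a minimal illustration in Python rather than ActionScript, not a complete AMF encoder: the function names are hypothetical, the integers are treated as unsigned, and every multi-byte value is assumed to be written most significant byte first (network byte order).

```python
import struct


def encode_int(value: int) -> bytes:
    """Int: two bytes, most significant byte first (assumed unsigned)."""
    return struct.pack(">H", value)


def encode_medium_int(value: int) -> bytes:
    """MediumInt: three bytes, most significant byte first."""
    return value.to_bytes(3, "big")


def encode_long(value: int) -> bytes:
    """Long: four bytes, most significant byte first."""
    return struct.pack(">I", value)


def encode_double(value: float) -> bytes:
    """Double: eight-byte IEEE-754 value (big-endian is an assumption here)."""
    return struct.pack(">d", value)


def encode_utf8(text: str) -> bytes:
    """UTF8: two-byte length prefix followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    if len(data) >= 2 ** 16:
        raise ValueError("too long for a UTF8 field; use LongUTF8 instead")
    return encode_int(len(data)) + data


def encode_long_utf8(text: str) -> bytes:
    """LongUTF8: four-byte length prefix followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_long(len(data)) + data
```

Note that the length prefix counts encoded bytes, not characters, so a string containing non-ASCII characters has a prefix larger than its character count.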


ActionScript 


Paradigm: Object-oriented 
Appeared in: 1998 
Designed by: Gary Grossman 
Developer: Macromedia, Adobe Systems 
Latest release: 3.0 / June 27, 2006 
Typing discipline: Strongly, statically, safely typed 
Major implementations: Adobe Flash, Adobe Flex 
Influenced by: JavaScript, Java 
OS: Cross-platform 
File extension: .as 
MIME type: application/actionscript [1] 


ActionScript is a scripting language based on ECMAScript, used primarily for the 
development of websites and software using the Adobe Flash Player platform (in the form of 
SWF files embedded into Web pages). Originally developed by Macromedia, the language is 
now owned by Adobe (which acquired Macromedia in 2005), which continues its development. 
ActionScript was initially designed for controlling simple 2D vector animations made in Adobe 
Flash (formerly Macromedia Flash). Later versions added functionality allowing for the creation 
of Web-based games and rich Internet applications with streaming media (such as video and 
audio). 


History 


ActionScript started as a scripting language for the Macromedia Flash authoring tool. The 
first three versions of the Flash authoring tool provided limited interactivity features. Early Flash 
developers could attach a simple command, called an "action", to a button or a frame. The set of 
actions was limited to basic navigation controls, with commands such as "play", "stop", "get 
URL", and "goto and play". 

With the release of Flash 4 in 1999, this simple set of actions matured into a small scripting 
language. New capabilities introduced for Flash 4 included variables, expressions, operators, if 
statements, and loops. Although referred to internally as "ActionScript", the Flash 4 user manual 
and marketing documents continued to use the term "actions" to describe this set of commands. 


Time line by player 


• Flash Lite 1.0 and 1.1: Flash Lite is the Flash technology specifically developed 
for mobile phones and consumer electronics devices. Flash Lite 1.1 supports 
Flash 4 ActionScript. 

• Flash Lite 2.1: Added support for Flash 7 ActionScript 2.0. 

• Flash Player 2: First version with scripting support. Actions included 
gotoAndPlay, gotoAndStop, nextFrame and nextScene for timeline control. 

• Flash Player 3: Expanded basic scripting support with the ability to load 
external SWFs (loadMovie). 

• Flash Player 4: First player with a full scripting implementation (called 
Actions). The scripting used a slash-based syntax and contained support for 
loops, conditionals, variables and other basic language constructs. 

• Flash Player 5: Included the first version of true ActionScript. Used prototype- 
based programming based on ECMAScript, and allowed full procedural 
programming and object-oriented programming. 

• Flash Player 6: Added an event handling model, and support for switch. 

• Flash Player 7: Offered some new features such as CSS text and 
performance improvements. Macromedia Flash compilers released alongside 
Flash Player 7 also support ActionScript 2.0, a class-based programming language 
based on the ECMAScript 4 Netscape Proposal. However, ActionScript 2.0 can 
cross compile to ActionScript 1.0 byte-code, so it can be run by Flash Player 6. 

• Flash Player 8: Further extended ActionScript 2.0 by adding new class libraries 
with APIs for controlling bitmap data at run-time, and file upload. 

• Flash Player 9 (initially called 8.5): Added ActionScript 3.0 with the advent of 
a new virtual machine, called AVM2 (ActionScript Virtual Machine 2), which 
coexists with the previous AVM1 needed to support legacy content. 
Performance increases were a major objective for this release of the player, 
including a new JIT compiler. This is the first release of the player to be titled 
Adobe Flash Player. 


Time line by ActionScript version 


2000-2003: ActionScript "1.0" With the release of Flash 5 in September 2000, the 
"actions" from Flash 4 were enhanced once more and named "ActionScript" for the first time.[2] 
This was the first version of ActionScript with influences from JavaScript and the ECMA-262 
(Third Edition) standard, supporting the said standard's object model and many of its core data 
types. Local variables may be declared with the var statement, and user-defined functions with 
parameter passing and return values can also be created. Notably, ActionScript could now also 
be typed with a text editor rather than being assembled by choosing actions from drop-down lists 
and dialog box controls. With the next release of its authoring tool, Flash MX, and its 
corresponding player, Flash Player 6, the language remained essentially unchanged; there were 
only minor changes, such as the addition of the switch statement and the "strict equality" (===) 
operator, which brought it closer to being ECMA-262-compliant. Two important features of 
ActionScript 1.0 that distinguish it from later versions are its loose type system and its reliance on 
prototype-based inheritance. Loose typing refers to the ability of a variable to hold any type of 
data. This allows for rapid script development and is particularly well-suited for small-scale 
scripting projects. Prototype-based inheritance is the ActionScript 1.0 mechanism for code reuse 
and object-oriented programming. Instead of a class keyword that defines common 
characteristics of a class, ActionScript 1.0 uses a special object that serves as a "prototype" for a 
class of objects. All common characteristics of a class are defined in the class's prototype object 
and every instance of that class contains a link to that prototype object. 
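
The prototype link described above can be modelled in a few lines. The following is a minimal sketch in Python rather than ActionScript; ProtoObject, new and the greeter objects are hypothetical names chosen for illustration and do not correspond to anything in the Flash Player. It shows only the lookup rule: a property that is missing on an object is looked up on that object's prototype.

```python
class ProtoObject:
    """Illustrative stand-in for an object that holds a link to a prototype."""

    def __init__(self, proto=None):
        self.slots = {}     # properties defined directly on this object
        self.proto = proto  # link to the prototype object, or None

    def get(self, name):
        # Walk the prototype chain until some object defines the property.
        obj = self
        while obj is not None:
            if name in obj.slots:
                return obj.slots[name]
            obj = obj.proto
        raise AttributeError(name)

    def set(self, name, value):
        # Assignment always writes to the object itself, never to its prototype.
        self.slots[name] = value

    def call(self, name, *args):
        # Methods are ordinary properties that receive the object as "this".
        return self.get(name)(self, *args)


def new(prototype, **own_properties):
    """Create an instance whose prototype link points at `prototype`."""
    obj = ProtoObject(prototype)
    obj.slots.update(own_properties)
    return obj


# The shared prototype plays the role that a class plays in later versions.
greeter_proto = ProtoObject()
greeter_proto.set("greeting", "Hello")
greeter_proto.set("greet", lambda this: f'{this.get("greeting")}, {this.get("name")}!')

alice = new(greeter_proto, name="Alice")
bob = new(greeter_proto, name="Bob", greeting="Hi")  # overrides one property
```

Because instances hold a link rather than a copy, changing greeter_proto afterwards changes the behaviour of every instance that has not overridden the property itself.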

2003-2006: ActionScript 2.0 The next major revision of the language, ActionScript 2.0, was 
introduced in September 2003 with the release of Flash MX 2004 and its corresponding player, 
Flash Player 7. In response to user demand for a language better equipped for larger and more 
complex applications, ActionScript 2.0 featured compile-time type checking and class-based 
syntax, such as the keywords class and extends. (While this allowed for a more flexible object- 
oriented programming approach, the code would still be compiled to ActionScript 1.0 bytecode, 
allowing it to be used on the preceding Flash Player 6 as well. In other words, the class-based 
inheritance syntax was a layer on top of the existing prototype-based system.) With ActionScript 
2.0, developers could constrain variables to a specific type by adding a type annotation so that 
type mismatch errors could be found at compile-time. ActionScript 2.0 also introduced class- 
based inheritance syntax so that developers could create classes and interfaces, much as they 
would in class-based languages such as Java and C++. This version conformed partially to the 
ECMAScript Fourth Edition draft specification. 

2006-today: ActionScript 3.0 In June 2006, ActionScript 3.0 debuted with Adobe Flex 2.0 
and its corresponding player, Flash Player 9. ActionScript 3.0 was a fundamental restructuring of 
the language, so much so that it uses an entirely different virtual machine. Flash Player 9 
contains two virtual machines, AVM1 for code written in ActionScript 1.0 and 2.0, and AVM2 
for content written in ActionScript 3.0. ActionScript 3.0 provides not only a significant 
enhancement in performance, but also a more robust programming model that lends itself to 
complex Rich Internet Application development. 
The update to the language introduced several new features: 

• Compile-time and runtime type checking: type information exists at both 
compile-time and runtime. 

• Improved performance from a class-based inheritance system separate from 
the prototype-based inheritance system. 

• Support for packages, namespaces, and regular expressions. 

• Compiles to an entirely new type of bytecode, incompatible with ActionScript 
1.0 and 2.0 bytecode. 

• Revised Flash Player API, organized into packages. 

• Unified event handling system based on the DOM event handling standard. 

• Integration of ECMAScript for XML (E4X) for purposes of XML processing. 

• Direct access to the Flash runtime display list for complete control of what gets 
displayed at runtime. 

• Completely conforming implementation of the ECMAScript Fourth Edition Draft 
specification. 


Syntax 


ActionScript code is free-form and thus may be written with whatever amount or style of 
whitespace the author desires. 


ActionScript 2.0 


When working in the Macromedia Flash IDE, the trace() function can be used to print 
information to the application's output panel. Using this command, a very short "legal" program 
in ActionScript would be the following code on frame 1 of layer 1 in an otherwise empty Flash 
document: 
trace("Hello, world!"); 

However, because this command has no visible effect in other interpreters of the language, 
such as the standard Adobe Flash Player, it arguably does not suffice. The following code, which 
works in any compliant player, can be considered the shortest program instead:[3] 
createTextField("greet", 0, 0, 0, 100, 100); 
greet.text = "Hello, world!"; 

This creates a text field at depth 0, at location (0, 0) px on the screen, that is 100 px wide and 
high. Then the "text" property is set to the "Hello, world!" string, and it is automatically 
displayed in the player. 

When writing external ActionScript 2.0 class files, the above example could be written in a 
file named Greeter.as as follows: 
class com.example.Greeter extends MovieClip { 
    public function Greeter() { 
        var txtHello:TextField = this.createTextField("txtHello", 0, 0, 0, 100, 100); 
        txtHello.text = "Hello World"; 
    } 
} 


ActionScript 3.0 
Minimal ActionScript 3.0 programs are somewhat larger and more complicated due to the 
inherently different structure of an application, and the increased separation of the programming 
language and the Flash IDE. 

Presume the following file to be Greeter.as: 

package com.example { 
    import flash.text.TextField; 
    import flash.display.Sprite; 

    public class Greeter extends Sprite { 
        public function Greeter() { 
            var txtHello:TextField = new TextField(); 
            txtHello.text = "Hello World"; 
            addChild(txtHello); 
        } 
    } 
} 

Finally, an example of using ActionScript when developing Flex applications, again 
presuming the following content to be in a file named Greeter.as: 

package { 
    public class Greeter { 
        public static function sayHello():String { 
            var greet:String = "Hello, world!"; 
            return greet; 
        } 
    } 
} 


This code will work with the following MXML application file: 
<?xml version="1.0" encoding="utf-8"?> 
<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml" xmlns="*" layout="vertical" 
    creationComplete="initApp()"> 
    <mx:Script> 
        <![CDATA[ 
            public function initApp():void { 
                // Prints our "Hello, world!" message into "mainTxt". 
                mainTxt.text = Greeter.sayHello(); 
            } 
        ]]> 
    </mx:Script> 
    <mx:Label id="title" fontSize="24" fontStyle="bold" 
        text="&quot;Hello, world!&quot; Example"/> 
    <mx:TextArea id="mainTxt" width="250"/> 
</mx:Application> 


Examples 
ActionScript 2.0 Examples 


The following prints Hello world into the Flash IDE output panel. Note this will only work 
when run inside the Flash IDE, as the trace function is only supported inside it. 
trace ("Hello world!"); 

The following code prints Hello world to the stage. 
_root.createTextField("myText", 1, 0, 0, 100, 25); 
_root.myText.text = "Hello world!"; 

The following code outputs the current mouse position when the mouse moves, by using the 
onMouseMove event. Again this will only work in the Flash IDE. 
onMouseMove = function () { 
    trace("X: " + _root._xmouse); 
    trace("Y: " + _root._ymouse); 
}; 

The following code, when placed on a Button, begins to play the timeline of the MovieClip 
the button is placed in. This also applies to the root timeline since the root is considered a 
MovieClip as well. 
on(release) { play(); } 

This more advanced example creates an array containing numbers and strings, and assigns a 
number to a variable called num and a string to a variable called str using prototype functions 
and function recursion. Then, using the MovieClip API, a text field is drawn on screen, into 
which the variable values are written. 
var myArray:Array = new Array("Hello", "ActionScript", 3, 7, 11, "Flash"); 

// Pick a random element; try again (recursively) until it is a number. 
Array.prototype.pickNumber = function():Number { 
    var rand:Number = random(this.length); 
    return (typeof (this[rand]) == "number") ? this[rand] : this.pickNumber(); 
}; 

// Pick a random element; try again (recursively) until it is a string. 
Array.prototype.pickString = function():String { 
    var rand:Number = random(this.length); 
    return (typeof (this[rand]) == "string") ? this[rand] : this.pickString(); 
}; 

var num:Number = myArray.pickNumber(); 
var str:String = myArray.pickString(); 

// Draw a text field and write the values into it. 
_root.createTextField("txt", 1, 10, 10, 530, 390); 
txt.text = "Array = " + myArray + "\nRandom Number = " + num + "\nRandom String = " + str; 

Array and dataProvider example: 
var aData:Array = [{name: "J. Bell", age: "55"}, {name: "B. Longman", age: "21"}]; 
dataGrid.dataProvider = aData; 


ActionScript 3.0 Examples 


This Hello World example uses ActionScript 3.0: 


package { 
    import flash.display.Sprite; 
    import flash.text.TextField; 
    import flash.filters.DropShadowFilter; 

    public class HelloWorld2 extends Sprite { 
        public function HelloWorld2() { 
            var shad:DropShadowFilter = new DropShadowFilter(2, 45, 0x000000, 25, 3, 3, 2, 2); 
            var txt:TextField = new TextField(); 
            txt.textColor = 0xFFFFFF; 
            txt.filters = [shad]; 
            txt.width = 120; 
            txt.x = Math.random() * 300; 
            txt.y = Math.random() * 300; 
            txt.selectable = false; 
            txt.text = "Hello World welcome! [" + Math.round(txt.x) + "," + Math.round(txt.y) + "]"; 
            addChild(txt); 
        } 
    } 
} 


Data structures 


Data types 
ActionScript primarily consists of "fundamental" or "simple" data types which are used to 
create other data types. These data types are very similar to Java data types. Since ActionScript 3 
was a complete rewrite of ActionScript 2, the data types and their inheritances have changed. 


ActionScript 2 top level data types 


String - A list of characters such as "Hello World" 

Number - Any Numeric value 

Boolean - A simple binary storage that can only be "true" or "false". 

Object - Object is the data type all complex data types inherit from. It allows for 
the grouping of methods, functions, parameters, and other objects. 


ActionScript 2 complex data types 
There are additional "complex" data types. These are more processor and memory intensive 
and consist of many "simple" data types. For AS2, some of these data types are: 


MovieClip - An ActionScript creation that allows easy usage of visible objects. 
TextField - A simple dynamic or input text field. Inherits the MovieClip type. 
Button - A simple button with 4 frames (states): Up, Over, Down and Hit. 
Inherits the MovieClip type. 

Date - Allows access to information about a specific point in time. 

Array - Allows linear storage of data. 

XML - An XML object 

XMLNode - An XML node 

LoadVars - A Load Variables object allows for the storing and sending of HTTP 
POST and HTTP GET variables. 

Sound 

NetStream 

NetConnection 

MovieClipLoader 

EventListener 


ActionScript 3 top level data types 


Boolean - The Boolean data type has only two possible values: true and false. 
No other values are valid. 

int - The int data type is a 32-bit integer between -2,147,483,648 and 
2,147,483,647. 

Null - The Null data type contains only one value, null. This is the default value 
for the String data type and all classes that define complex data types, including 
the Object class. 

Number - The Number data type can represent integers, unsigned integers, and 
floating-point numbers. The Number data type uses the 64-bit double-precision 
format as specified by the IEEE Standard for Binary Floating-Point Arithmetic 
(IEEE-754). 

String - The String data type represents a sequence of 16-bit characters. Strings 
are stored internally as Unicode characters, using the UTF-16 format. Previous 
versions of Flash used the UTF-8 format. 

uint - The uint (Unsigned Integer) data type is a 32-bit unsigned integer 
between 0 and 4,294,967,295. 

void - The void data type contains only one value, undefined. In previous 
versions of ActionScript, undefined was the default value for instances of the 
Object class. In ActionScript 3.0, the default value for Object instances is null. 
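
The numeric limits quoted for int, uint and Number in the list above follow directly from their bit widths. The short Python sketch below reproduces the arithmetic; the helper names to_int32 and to_uint32 are hypothetical, and the wrap-around they show is the usual two's-complement reduction modulo 2^32, offered as an illustration rather than as a statement about what the Flash Player does in every arithmetic expression.

```python
# Limits that follow from a 32-bit width.
INT_MIN = -2 ** 31      # -2,147,483,648
INT_MAX = 2 ** 31 - 1   #  2,147,483,647
UINT_MAX = 2 ** 32 - 1  #  4,294,967,295


def to_uint32(value: int) -> int:
    """Reduce an integer modulo 2^32 into the unsigned 32-bit range."""
    return value & 0xFFFFFFFF


def to_int32(value: int) -> int:
    """Reduce an integer modulo 2^32 into the signed 32-bit range."""
    value = to_uint32(value)
    return value - 2 ** 32 if value >= 2 ** 31 else value


# A 64-bit IEEE-754 double has a 53-bit significand, so every integer up to
# 2^53 is represented exactly, but not every integer beyond that.
LARGEST_EXACT_INTEGER = 2 ** 53
```

This is why the Number type can hold every int and uint value without loss: both ranges lie well inside the span of integers that a double represents exactly.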


ActionScript 3 complex data types 


Object - The Object data type is defined by the Object class. The Object class 
serves as the base class for all class definitions in ActionScript. 

Array - Contains a list of data. The items in the list can be of any type, even 
other arrays. 

Date 

Error 

Function 

RegExp 

XML 

XMLList 


Using Data Types 


Using these data types in ActionScript is quite simple. The basic syntax is: 
var YourVariableName:YourVariableType = new YourVariableType(Param1, Param2, ..., ParamN); 


So in order to make an empty Object: 
var MyObject:Object = new Object(); 

Some types are automatically put in place: 
var MyString = "Hello Wikipedia!"; // This would automatically set the variable as a string. 
var MyNumber = 5; // This would do the same for a number. 
var MyObject = {Param1:"Hi!", Param2:76}; // This creates an object with two variables. 
// Param1 is a string with the data of "Hi!", 
// and Param2 is a number with the data of 76. 
var MyArray = [5, "Hello!", {a:5, b:7}]; // This is the syntax for automatically creating an Array. 
// It creates an Array with 3 variables. 
// The first (0) is a number with the value of 5, 
// the second (1) is a string with the value of "Hello!", 
// and the third (2) is an object with {a:5, b:7}. 


Criticisms 


Slow compiler - Complaints have long been levelled at Adobe that the Flash IDE's 
compiler is slow. The claim is debatable, but several alternative compilers now 
exist, such as Haxe, MTASC, and the Flex 2 compiler. 

Proprietary language - Adobe's partial release of the SWF file format to the 
open source community has been criticized as too little. 

Components and backward-compatibility - Two versions of components 
cause compatibility problems. 

Weak typing - ActionScript syntax is very tolerant and therefore more robust 
than some programming languages. For example, ActionScript 2 allows 
programmers to implement strict datatyping, but does not force them to do so. 
It also fails silently in most cases, so error-catching must be custom-built by the 
programmer. (ActionScript 3.0 resolves this issue.) 

Performance - Flash's ActionScript VM tends to hit a ceiling quickly with regard 
to the amount of computation that ActionScript can perform before triggering 
an internal timeout, especially on the Mac Flash Player. Simply counting the 
numbers from 1 to 5000, for instance, threatens to exceed the capacity of the 
Flash Player for some users. 


[1] RFC 4329 

[2] Note that the name "ActionScript 1.0" is a retronym, coined after the release 
of ActionScript 2.0. 

[3] Note that this is also valid ActionScript "1.0", the original scripting language of 
Flash. However, it does not produce a visible text box in Flash Player versions 
lower than 6, as an embedded font is required. 


Adobe Flash — The program in which ActionScript debuted. 
Adobe Flex 
SWF File Format 


Adobe Shockwave 


Adobe Shockwave Player 
Developer: Adobe Systems 
Latest release: 10.2.0.022 / 2007-05-07 
OS: Microsoft Windows, Mac OS X (PowerPC only) 
Platform: Web browsers 
Genre: Multimedia Player 
MIME type: application/x-director 
Website: Adobe Shockwave Player 


Adobe Shockwave (formerly Macromedia Shockwave) was Macromedia's first and most 
successful multimedia player prior to the introduction of Macromedia Flash (now Adobe Flash). 
In an attempt to raise its brand profile all Macromedia players prepended Shockwave to their 
names in the late 1990s. Although this campaign was very successful and helped establish 
Shockwave Flash as a dominant multimedia plugin, Shockwave and Flash became more difficult 
to maintain as two separate products. In 2005, Macromedia marketed three distinct browser 
player plugins under the brand names Macromedia Authorware, Macromedia Shockwave and 
Macromedia Flash. 

Although Shockwave was designed for making a wide variety of online movies and 
animations, its actual use has become concentrated in the area of game development. Other 
features not replicated by Flash include a much faster rendering engine, including hardware- 
accelerated 3D, blend modes for layered display of graphic assets and support for various 
network protocols, including Internet Relay Chat. Furthermore Shockwave's functionality can be 
extended with so-called "Xtras". 

Unlike Flash, the Shockwave browser plugin is not available for Linux or Solaris despite 
vocal lobbying efforts.[1] However, the Shockwave Player can be installed on Linux with 
Codeweavers' Crossover Office,[2] a program which allows some Windows-based applications 
to be installed and run on Linux. According to Adobe, Macromedia Shockwave Player is 
available on 52.3% of Internet-enabled PCs.[3] It uses .DCR files created using the authoring 
tool Macromedia Director. 


Animation portal 


An animation portal is a system where a user is able to submit his or her own animation in a 
video file format to a website and have it immediately be displayed to other visitors. Other 
visitors will have the privilege to rate it (typically on a scale of 1 through 10), and give a suitable 
comment followed by custom ratings such as originality, sound, storyline, graphics, etc. Most 
existing animation portals currently only accept submissions in the .swf file format, and these 
can also be subcategorized under the name flash portals. 

By far the biggest website containing an animation portal (a flash portal, in this case) is 
Newgrounds, which boasts over a million members and has over 500,000 submissions. Other 
websites featuring flash portals include the Stick Figure Death Theatre (SFDT). An example of a 
website which is an animation portal but not a flash portal is StopMoShorts.com[1], which 
focuses on stop motion animation and accepts most file types except .swf. 


History 
As animated films have become more sophisticated and animators more technologically 
adept, even amateur animators can achieve high quality results and lengthy animations 
lasting an hour or more, if given enough time and effort. Even so, newcomers to the online 
community can still find it difficult for their own site to gain popularity by means of a portal, 
leading to a decline in the amateur teenage animation community. Surprisingly, to date, no 
animation portal script has ever been made, which acts as a barrier to newcomers to the area. 
There has been, however, speculation that E2-Productions will be releasing a free animation 
portal script to the public soon. 

Internet animations evolved from being frail drawings of stick men to using a wide variety of 
different styles as artists began testing the limits of the Flash software. At the peak of stick men 
animation (mostly known as stick anims, made using tools such as Pivot), stick sites began 
popping up simultaneously around the internet, showcasing animations such as the hit series Joe 
Zombie, by Rob Den Bleyker. However, these "stick sites" have been on the decline in favor of 
higher quality animations. 


Technology 
Most animation portals are run by a MySQL database, though it is said to be possible to 
create one using just PHP "flat files". A portal run by flat files would be much harder to create 
and update, but it is probably possible with enough commitment. There are no known sites with 
a portal run by flat files as of now. 
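
To make the flat-file idea concrete, here is a minimal sketch of a portal store kept in a single JSON file. It is written in Python rather than PHP, and the file layout, the function names and the 1 to 10 rating scale are assumptions chosen for illustration, not a description of any existing portal.

```python
import json
import os


def load_portal(path):
    """Read every submission from the flat file (empty list if none yet)."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def save_portal(path, submissions):
    """Rewrite the whole flat file with the current submissions."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(submissions, handle, indent=2)


def submit(path, title, filename):
    """Add a new animation with no ratings yet."""
    submissions = load_portal(path)
    submissions.append({"title": title, "file": filename, "ratings": []})
    save_portal(path, submissions)


def rate(path, title, score):
    """Record one visitor rating on an assumed 1 to 10 scale."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    submissions = load_portal(path)
    for entry in submissions:
        if entry["title"] == title:
            entry["ratings"].append(score)
            break
    else:
        raise KeyError(title)
    save_portal(path, submissions)


def average_rating(path, title):
    """Return the mean rating of a submission, or None if it has none."""
    for entry in load_portal(path):
        if entry["title"] == title and entry["ratings"]:
            return sum(entry["ratings"]) / len(entry["ratings"])
    return None
```

Every change reads and rewrites the entire file, and nothing here guards against two visitors rating at the same moment, which gives a sense of why a database is the more common choice.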


SWF 


File extension: .swf 
MIME type: application/x-shockwave-flash 
Developed by: Macromedia (Adobe Systems) 
Type of format: Vector graphic image 


SWF is a proprietary vector graphics file format produced by the Flash software from Adobe 
(formerly Macromedia). Intended to be small enough for publication on the web, SWF files can 
contain animations or applets of varying degrees of interactivity and function. SWF is also 
sometimes used for creating animated display graphics and menus for DVD movies, and 
television commercials. 

The Flash program produces SWF files as a compressed and uneditable final product, 
whereas it uses the .fla format for its editable working files. 

The name is a backronym of sorts, standing for "Small Web Format" and "Shockwave Flash". 
According to Adobe, SWF is pronounced "S W F" (with each letter being pronounced 
individually), but some people prefer to pronounce it as "swiff"[2] or "swaif". A file of this 
format is called a Shockwave Flash Object. SWF is currently the dominant format for displaying 
animated vector graphics on the web, far exceeding the W3C open standard SVG, which has met 
with problems over competing implementations. 


Description 


Originally limited to presenting vector-based objects and images in a simple sequential 
manner, the newer versions of the format allow audio, video and many different possible forms 
of interaction with the end user. Once created, SWF files can be played by the Adobe Flash 
Player, working either as a browser plugin or as a standalone player. SWF files can also be 
encapsulated with the player, creating a self-running SWF movie called a "projector". 

The file format was first created by a small company called FutureWave, which was later 
acquired by Macromedia, and had one main goal: create small files for displaying entertaining 
animations. The idea was to have a format which could be reused by a player running on any 
system and which would work over a slower network connection (such as a browser used with a 
modem). 

Plugins to play SWF files in web browsers are available from Adobe for most desktop 
operating systems, including Microsoft Windows, Apple Mac, and Linux on the x86 architecture. 
Adobe claims that over 97% of web users now have an SWF plugin installed [3], based on an 
independent study conducted by NPD Research. Sony PlayStation Portable consoles can play
limited SWF files in their web browser, but only with firmware version 2.71 and later.
Nintendo's Wii console can run SWF files through its Opera browser. 

A free software implementation of a SWF player is gnash, which as of 2007 is undergoing 
intensive development. 


Licensing 


Although a full specification of SWF is available, it is not an open format, as implementing 
software that plays the format is disallowed by the specification’s license. Reverse engineering is 
therefore the only legal way to compete with the official SWF player. Implementing software 
which creates SWF files is permitted, on the condition that the resulting files render "error free
in the latest publicly available version of Adobe Flash Player."

The free software SWF player gnash is being developed by GNU under the GNU General
Public License (GPL).


Content management system 


A Content Management System (CMS) is a software system used for content management. 
Content management systems are deployed primarily for interactive use by a potentially large
number of contributors. For example, the software for the website Wikipedia is based on a
content management system.

The content managed includes computer files, image media, audio files, electronic
documents and web content. The idea behind a CMS is to make these files available inter-office,
as well as over the web. A content management system is often used for archiving as
well. Many companies use a CMS to store files in a non-proprietary form. Companies also use a CMS
to share files with ease; because most systems use server-based software, file availability is
broadened even further. As shown below, many content management systems include a feature for web
content, and some have a feature for a "workflow process."

"Work flow" is the idea of moving an electronic document along for either approval, or for 
adding content. Some content management systems facilitate this process with email
notification and automated routing. Ideally, the result is a collaborative creation of documents. A
CMS facilitates the organization, control, and publication of a large body of documents and other 
content, such as images and multimedia resources. 
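
The routing idea can be sketched as a small state machine. This is a minimal illustration, assuming a fixed, made-up sequence of stages and a callback standing in for email notification; the names `STAGES`, `Document` and `advance` are hypothetical and not taken from any particular CMS.

```python
from dataclasses import dataclass, field

# Hypothetical, ordered list of workflow stages a document passes through.
STAGES = ["draft", "review", "approved", "published"]


@dataclass
class Document:
    title: str
    stage: str = "draft"
    history: list = field(default_factory=list)


def advance(doc: Document, actor: str, notify) -> str:
    """Move the document to the next stage and notify interested parties."""
    position = STAGES.index(doc.stage)
    if position == len(STAGES) - 1:
        raise ValueError("document is already published")
    doc.stage = STAGES[position + 1]
    doc.history.append((actor, doc.stage))
    # A real CMS would send email here; this sketch just calls a callback.
    notify(f"{doc.title!r} moved to {doc.stage} by {actor}")
    return doc.stage


messages = []
manual = Document("Operator's manual")
advance(manual, "alice", messages.append)
advance(manual, "bob", messages.append)
```

Passing the notifier in as a callback keeps the routing logic independent of how people are actually alerted.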

A web content management system is a content management system with additional features 
to ease the tasks required to publish web content to web sites. 

Web content management systems are often used for storing, controlling, versioning, and 
publishing industry-specific documentation such as news articles, operators' manuals, technical 
manuals, sales guides, and marketing brochures. A content management system may support the 
following features: 

• Import and creation of documents and multimedia material

• Identification of all key users and their content management roles

• The ability to assign roles and responsibilities to different content categories or
types.

• Definition of the content workflow tasks, often coupled with event messaging so
that content managers are alerted to changes in content.

• The ability to track and manage multiple versions of a single instance of content.

• The ability to publish the content to a repository to support access to the
content. Increasingly, the repository is an inherent part of the system, and
incorporates enterprise search and retrieval.

• Some content management systems allow the textual aspect of content to be
separated to some extent from formatting. For example, the CMS may
automatically set default color, fonts, or layout.
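
The last point, keeping text apart from formatting, can be sketched as follows. This is only an illustration using Python's standard `string.Template`; the style values, the `PAGE` template and the `render` function are assumptions made for the example, and a real system would also escape the content before inserting it into HTML.

```python
from string import Template

# Hypothetical default presentation applied by the CMS, kept apart from the content.
DEFAULT_STYLE = {"color": "#333333", "font": "Georgia, serif"}

PAGE = Template(
    '<article style="color: $color; font-family: $font">'
    "<h1>$title</h1><p>$body</p></article>"
)


def render(content: dict, style: dict = DEFAULT_STYLE) -> str:
    """Combine plain-text content with whatever style the system supplies."""
    return PAGE.substitute(**content, **style)


article = {"title": "Sales guide", "body": "Updated for the spring catalogue."}
html_default = render(article)
html_custom = render(article, {"color": "navy", "font": "Arial, sans-serif"})
```

The same `article` dictionary produces two differently styled pages, which is the point of the separation.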

Content management systems take the following forms: 

• a web content management system is software for web site management,
which is often what is implicitly meant by this term

• the work of a newspaper editorial staff organization

• a workflow for article publication

• a document management system

• a single source content management system, where content is stored in chunks
within a relational database.


Terminology


The following terms are often used in relation to web content management systems, but they
may be neither standard nor universal:

Block 
A block is a link to a section of the web site. Blocks can usually be specified to appear on 
all pages of the site (for example in a lefthand navigation panel) or only on the home 
page. 

Module 
A content module is a section of the web site, for example a collection of news articles, a 
FAQ section, etc. Some content management systems may also have other special types 
of modules, for example administration and system modules. 

Theme 
A theme specifies the cosmetic appearance of every page of the web site, controlling 
properties such as the colours and the fonts. 

Module-based CMS 
Most tasks in a document's life-cycle are served by CMS modules. Common modules are 
document creation/editing, transforming and publishing. 

Document transformation language-based CMS
Another approach to CMS building uses open standards. An XSLT-based CMS
compiles finished documents from XML data and an XSLT template. An XML Sapiens-based CMS
compiles a document from a stream of ‘pure’ data, a design template, and a functionality
template.

Web-based CMS
Another approach to CMS building uses databases such as PostgreSQL, MySQL or MS
SQL, and scripting languages or tools such as ColdFusion, PHP, JSP or ASP to interact
with the data and turn them into visual content. Data stored in a database are queried
and compiled into HTML pages or other documents, which are styled using cascading
style sheets. These systems can include a number of other functions, such as discussion
boards, blogs, or email newsletters.
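
The web-based approach can be sketched in a few lines. The example below is an assumption-laden illustration: it uses Python with an in-memory SQLite database in place of the databases and scripting languages named above, and the table name, column names and `site.css` stylesheet are invented for the example.

```python
import html
import sqlite3

# In-memory SQLite stands in for PostgreSQL/MySQL; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [("Welcome", "First post."), ("News", "Prices & availability updated.")],
)


def render_page(connection) -> str:
    """Query the stored articles and compile them into one HTML page."""
    rows = connection.execute("SELECT title, body FROM articles ORDER BY id").fetchall()
    items = "\n".join(
        f"<article><h2>{html.escape(title)}</h2><p>{html.escape(body)}</p></article>"
        for title, body in rows
    )
    # The stylesheet link is where CSS would control the visual presentation.
    return (
        "<html><head><link rel=\"stylesheet\" href=\"site.css\"></head>\n"
        f"<body>\n{items}\n</body></html>"
    )


page = render_page(conn)
```

Escaping the stored text before placing it in the page keeps characters such as `&` from breaking the markup.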


See also 


• List of content management systems
• Enterprise Content Management


Blog software | Open source content management systems | Version control systems | Wiki | 
Zope | List of content management systems | List of content management frameworks | 
Document management system | Enterprise content management | Geospatial Content 
Management System | Web content 


Blog software 


Weblog software (also called blog software or blogware) is a category of software which 
consists of a specialized form of content management systems specifically designed for creating 
and maintaining weblogs.


Server models


Many weblog applications are available for users to download and install on their own 
systems. Some of these are open-source software that can be used, modified, and redistributed freely.

Other weblog applications are offered only through their developers’ hosts, either free of 
charge or for a fee. These typically include hosting service for the published blog itself, but some 
offer the option of using this hosted software to update a blog published elsewhere. 


Clients 


Maintenance through the Internet is a nearly universal feature of weblog software. This is 
usually done through a browser-based interface, enabling the author to update the site using the 
same client software used to view its contents. Some software supports the use of specialized 
client software to update sites, such as an applet running on an internet-connected workstation or on
a PDA, or a client that can run offline and/or interface with a variety of weblog server 
applications. 


Features 


A blog entry typically consists of the following:
• Title: the main title, or headline, of the post.
• Body: the main content of the post.
• Permalink: the URL of the full, individual article.
• Post Date: the date and time the post was published.
A blog entry optionally includes the following:
• Comments: a way to provide discussion on blog entries. Readers can leave a comment on
a post, which can correct errors or contain their opinion on the post or the post's subject. Services
like coComment aim to ease discussion through comments by allowing them to be tracked.
• Categories (or tags): the subjects that the entry discusses.
• Trackback and/or pingback: links to other sites that refer to the entry.
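
The parts of a blog entry listed above map naturally onto a record type. The following is a minimal sketch, not the data model of any specific blog package; the class name, field names and example URL are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class BlogEntry:
    # Parts that every entry carries.
    title: str
    body: str
    permalink: str
    post_date: datetime
    # Optional extras, empty unless the blog uses them.
    comments: list = field(default_factory=list)
    categories: list = field(default_factory=list)
    trackbacks: list = field(default_factory=list)


entry = BlogEntry(
    title="Hello, world",
    body="My first post.",
    permalink="http://blog.example.com/2007/08/hello-world",
    post_date=datetime(2007, 8, 5, 9, 30),
)
entry.categories.append("announcements")
entry.comments.append({"author": "reader", "text": "Welcome!"})
```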


Other applications 


Most weblog applications have features such as facilitating authoring and editing of blog 
posts or articles, various linking and web syndication features, and the ability to easily publish 
the blog to the world wide web. Some services or organizations are also creating weblog 
applications with extended features to aid communication, such as the wiki capabilities in 
Socialtext and Traction TeamPage. 

Many weblog applications allow the user to define static pages of content which can often be 
placed into a hierarchy or tree. Pages differ from blog posts in that the content is largely static 
and not time related. Pages are often used to present information about the blog and its author. 
Extensive use of pages can result in a blog that looks more like a website. [1] 

Most weblog applications support English and many other languages. The user selects a 
language during installation. Weblog applications usually offer a web syndication service in
the form of either RSS or Atom. This allows other software, such as feed aggregators, to maintain a
current summary of the blog's content.
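
To make the syndication idea concrete, the sketch below builds a minimal RSS 2.0 document with Python's standard `xml.etree.ElementTree`. It is an illustration only: the function name `build_rss`, the blog title and the URLs are invented, and a real feed would normally carry more elements, such as publication dates.

```python
import xml.etree.ElementTree as ET


def build_rss(blog_title: str, blog_url: str, posts: list) -> str:
    """Return a minimal RSS 2.0 document for the given posts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = blog_title
    ET.SubElement(channel, "link").text = blog_url
    ET.SubElement(channel, "description").text = f"Recent posts from {blog_title}"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["link"]
    return ET.tostring(rss, encoding="unicode")


feed = build_rss(
    "Example Blog",
    "http://blog.example.com/",
    [{"title": "Hello, world", "link": "http://blog.example.com/hello-world"}],
)
```

A feed aggregator would fetch this document periodically and compare the items with those it has already seen.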

Post moderation requires that comments on a blog's articles be approved before they are
visible to the world. Where multiple people have accounts and the ability to post new items
to the blog, it can also mean that new content must be approved by a moderator or
administrator before it shows up on the main page.
Weblog applications use various user account systems that allow readers to post comments to a
particular blog. For instance, users with Blogger accounts may comment on any Blogger blog.
Other weblog applications allow users to post content or comments only to blogs where they
have an account.
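
Comment moderation amounts to a holding queue. The sketch below is a hypothetical, in-memory illustration (the class `ModeratedBlog` and its methods are invented); real blogware would store pending comments in its database and tie approval to user accounts.

```python
class ModeratedBlog:
    """Holds submitted comments back until a moderator approves them."""

    def __init__(self):
        self._pending = {}   # comment id -> text awaiting approval
        self._visible = []   # approved comments, in approval order
        self._next_id = 1

    def submit(self, text: str) -> int:
        comment_id = self._next_id
        self._next_id += 1
        self._pending[comment_id] = text
        return comment_id

    def approve(self, comment_id: int) -> None:
        self._visible.append(self._pending.pop(comment_id))

    def reject(self, comment_id: int) -> None:
        self._pending.pop(comment_id)

    def visible_comments(self) -> list:
        return list(self._visible)


blog = ModeratedBlog()
first = blog.submit("Nice article!")
spam = blog.submit("Buy cheap watches")
blog.approve(first)
blog.reject(spam)
```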

The Post API can vary greatly depending on the system that you are using. Some types of 
blogware have plugins for Firefox that integrate into the browser's menus so that right-clicking 
on selected text on any given webpage will bring up a small window that allows the user to post 
to their blog. Other types of blogware that do not have this type of interface require a person to 
fill out a form online. The form that is required for posting material to a blog depends on the type 
of blogware. Some types such as Movable Type contain a greater number of form fields and 
choices than ones such as Blogger. 

All types of blogware support adding thumbnail images within blog posts. Photo blogging is 
a separate genre of blogging that deals primarily with images. 


Documentation and support 


Different blogware packages feature varying levels of community support and 
documentation. Because the installation of some types of software requires an advanced 
knowledge of computer administration, community support and documentation can be very 
helpful. The web servers and database software can be more difficult to install than the blogware 
itself. A strong and active community surrounding the blogware can also give advice on integrating
the blogware into a personal site.


Examples 


A partial list of notable weblog software follows: 
User-hosted 


Software packages installed by weblog authors to run on their own systems: 
Free/Libre/Open-Source software 
These software packages are offered under an Open Source Software license. Therefore they 
are free for everyone to use. Also, commercial support contracts are often available. 
• Diarios Online (Windows)
• Apache Roller (Java-based)
• b2evolution (PHP/MySQL)
• bBlog (PHP/MySQL)
• Beta-Blogger (PHP/MySQL or text files)
• BlogEngine.NET (C#/ASP.NET)
• blojsom (Java-based)
• blosxom (Perl)
• boastMachine (PHP/MySQL)
• Collablog (PHP/MySQL)
• dasBlog (C#/ASP.NET)
• PyBlosxom (Python)
• Dotclear (PHP/MySQL)
• Drupal (PHP/MySQL)
• Elgg (Linux, Apache, MySQL, and PHP)
• FlatPress (PHP/Flat Files)
• Geeklog (PHP/MySQL)
• Horizon Blogging Software (PHP/MySQL)
• LiveJournal (Perl) (also available developer-hosted)
• LifeType (PHP/MySQL)
• Mephisto (Ruby on Rails)
• Nucleus CMS (PHP)
• Pivotlog (PHP)
• Serendipity (PHP/MySQL, PostgreSQL, SQLite)
• Simplelog (Ruby on Rails)
• SimplePHPBlog (PHP/Flat Files)
• Slash (Perl)
• Subtext (C#/ASP.NET)
• Textpattern (PHP/MySQL)
• Typo (Ruby on Rails)
• WordPress (PHP/MySQL), also offered in developer-hosted form as WordPress.com


Proprietary software 


These packages are under a proprietary license. They may require the purchase of a license
key to use them. The specific licensing terms vary, but some are free for personal or
non-commercial use.


• Community Server (also offered in developer-hosted form as CommunityServer.com)
• entity CMS
• ExpressionEngine (formerly pMachine)
• Movable Type (also offered in developer-hosted form as TypePad)
• Radio UserLand
• Traction TeamPage
• Windows Live Writer (also offered in developer-hosted form as spaces.live.com)


Unknown license


• Battle Blog
• Blogsphere


Developer-hosted 


Software services operated by the developer, requiring no software installation for the 
weblog author: 
• Atom5
• Battle Blog
• Blog.com
• Blogger
• BlogHarbor
• Blogging Systems
• Blue Kaffee
• EFx2.com
• DeadJournal
• GreatestJournal
• LiveJournal
• Multiply.com
• MySpace
• NireBlog.com
• NooBlog.pl
• Open Diary
• punditPending.com
• TypePad
• Typo
• Windows Live Spaces
• WeBlog.ro
• Webmunism
• WordPress
• AftaBlog.com
• Xanga
• Vox


WordPress | Drupal | Joomla! | LifeType | Movable Type | Serendipity | TYPO3 | TypePad 


WordPress 


Developer: Matt Mullenweg, Ryan Boren, Donncha O Caoimh
Latest release: 2.2.2 / 5 August 2007
OS: Cross-platform
Platform: PHP
Genre: Blog publishing system
License: GNU General Public License
Website: http://wordpress.org/ and http://wordpress.com/


WordPress is a blog publishing system written in PHP and backed by a MySQL database. 
WordPress is the official successor of b2\cafelog, developed by Michel Valdrighi. The name 
WordPress was suggested by Christine Selleck, a friend of lead developer Matt Mullenweg. 

The latest release of WordPress is version 2.2.2, released on 5 August 2007. It is distributed
under the GNU General Public License. 


Features 


• Generates standards-compliant XML, XHTML, and CSS
• Integrated link management
• Search engine-friendly permalink structure
• Extensible plugin support
• Nested categories and multiple categories for articles
• TrackBack and Pingback
• Typographic filters for proper formatting and styling of text
• Static Pages
• Multiple Authors
• Supports LaTeX [1]
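
As an illustration of what a search engine-friendly permalink structure involves, the sketch below derives a readable URL from a post title. It is written in Python for brevity and is not WordPress's own code (WordPress is written in PHP); the function names and the year/month URL pattern are assumptions.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated URL fragment."""
    # Strip accents so that "Café" becomes "Cafe".
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def permalink(base_url: str, year: int, month: int, title: str) -> str:
    """Build a dated, human-readable URL for a post."""
    return f"{base_url.rstrip('/')}/{year:04d}/{month:02d}/{slugify(title)}/"
```

For example, `permalink("http://blog.example.com/", 2007, 8, "Hello, World!")` yields a URL ending in `/2007/08/hello-world/` rather than an opaque numeric query string.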


History 


b2, the precursor to WordPress, was also written in PHP for use with MySQL by Michel 
Valdrighi, who is now a contributing developer to WordPress. Though WordPress is the official 
successor, another project, b2evolution, is also in active development. 

WordPress first appeared in 2003 as a joint effort between Matt Mullenweg and Mike Little
to create a fork of b2.

In 2004 the licensing terms for the competing Movable Type package were changed by Six
Apart, and many of its users migrated to WordPress, causing a marked and continuing growth
in WordPress's popularity.


Releases 


WordPress releases are named after well-known jazz musicians. WordPress 1.0 was
codenamed Mingus (after Charles Mingus).

WordPress 1.5 was released in mid-February 2005 and codenamed Strayhorn after Billy
Strayhorn. It added a range of important new features. One is the ability to manage static pages,
which allows content pages to be created and managed outside the normal blog chronology and
was the first step away from simple blog management software towards a full
content management system. Another is the new template/theme system, which allows users to
easily activate and deactivate "skins" for their sites. WordPress was also equipped with a new
default template (codenamed Kubrick after the late Stanley Kubrick) designed by Michael
Heilemann.

WordPress 2.0 was released in December 2005 and codenamed Duke after jazz pianist and
composer Duke Ellington. This version added rich editing, better administration tools, image
uploading, faster posting, an improved import system, and a completely overhauled back end.
WordPress 2.0 also offered various improvements to plugin developers.

On 22 January 2007, another major upgrade, WordPress 2.1, codenamed Ella after jazz 
vocalist Ella Fitzgerald, was released. In addition to correcting security issues, version 2.1 
featured a redesigned interface and enhanced editing tools (including integrated spell check and 
auto save), improved content management options, and a variety of code and database 
optimizations. 

WordPress 2.2, codenamed Getz after tenor saxophonist Stan Getz, was released on 16 May 
2007. Version 2.2 featured widget support for templates, updated Atom feed support, and speed 
optimizations.[7] WordPress 2.2 was initially slated to have a revised taxonomy system for
categories, as well as tags, but a proposed revision led to the feature being held back from
release.


Vulnerabilities 


In January 2007, many high-profile search engine optimization (SEO) blogs, as well as many
low-profile commercial blogs featuring AdSense, were targeted and attacked with a WordPress
exploit.

A separate vulnerability on one of the project site's web servers allowed an attacker to
introduce exploitable code in the form of a back door to some downloads of WordPress 2.1.1.
The 2.1.2 release addressed this issue; an advisory released at the time advised all users to
upgrade immediately.

In May 2007, a study reported that 98% of the WordPress blogs then being run were exploitable.

In a June 2007 interview, Stefan Esser, the founder of the PHP Security Response Team,
spoke critically of WordPress's security track record, citing problems with the application's
architecture that make it unnecessarily difficult to write code that is secure against SQL injection
vulnerabilities, as well as other problems.
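
For readers unfamiliar with SQL injection, the general problem can be sketched as follows. The example uses Python and an in-memory SQLite database purely for illustration; it does not reproduce WordPress code, and the table, column and function names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO posts (author, title) VALUES (?, ?)",
    [("alice", "Public post"), ("bob", "Private draft")],
)


def titles_unsafe(author: str) -> list:
    # Vulnerable: the input is pasted into the SQL text and can change the query's meaning.
    query = f"SELECT title FROM posts WHERE author = '{author}' ORDER BY id"
    return [row[0] for row in conn.execute(query)]


def titles_safe(author: str) -> list:
    # Safer: the driver passes the value separately, so it is only ever treated as data.
    query = "SELECT title FROM posts WHERE author = ? ORDER BY id"
    return [row[0] for row in conn.execute(query, (author,))]


malicious = "alice' OR '1'='1"
```

With the `malicious` input, the unsafe version returns every row in the table, while the parameterized version returns nothing because no author has that literal name.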


WordPress MU 


WordPress supports one weblog per installation, though multiple concurrent copies may be 
run from different directories if configured to use separate database tables. 

WordPress MU is a fork of WordPress created to allow multiple blogs to exist simultaneously within
one installation. WordPress MU makes it possible for anyone with a website to host their own
blogging community and to control and moderate all the blogs from a single dashboard. Notable
communities that use MU are WordPress.com and Harvard University.


Developers 


WordPress development is led by Ryan Boren and Matt Mullenweg. Mullenweg and Mike 
Little were co-founders of the project. 
The contributing developers include: 
• Dougal Campbell
• Mark Jaquith
• Donncha O'Caoimh
• Andy Skelton
• Michel Valdrighi
• Peter Westwood
Though largely developed by the community surrounding it, WordPress is closely associated
with Automattic, where some of WordPress's main contributing developers are employees.
WordPress is also in part developed by its community, among which are the WP testers, a
group of people who volunteer time and effort to test each release. They have early access to
nightly builds, beta versions and release candidates. By upgrading to these versions, they can find
errors and report them to a special mailing list or to the project's Trac tool.


Sponsored themes


On 10 July 2007, following a post by Mark Ghosh in his blog Weblog Tools Collection,
Matt Mullenweg announced that the official WordPress theme directory at
http://themes.wordpress.net would no longer host themes containing sponsored links. Although
this move was criticised by designers of sponsored themes, it was widely applauded by
WordPress users, many of whom consider such themes to be spam.


See also 


• Comparison of content management systems


Further reading 


• Douglass, Robert T.; Mike Little, Jared W. Smith (2005). Building Online
Communities With Drupal, phpBB, and WordPress. New York: Apress.
ISBN 1-59059-562-9.

• Hayder, Hasin (2006). WordPress Complete. United Kingdom: Packt Publishing.
ISBN 1-90481-189-2.


Movable Type 


Developer: Six Apart
Latest release: 3.35 / April 17, 2007
Preview release: 4.0 / June 5, 2007
Platform: Perl
Genre: Blog publishing system
License: proprietary (GPL version promised)
Website: www.sixapart.com/movabletype


Movable Type is a weblog publishing system developed by California-based Six Apart. It
was publicly announced on 3 September 2001,[2] and version 1.0 was publicly released on 8
October 2001.

Its TrackBack feature was introduced in version 2.2, and has since been adopted by a number 
of other blog systems. Movable Type also supports most popular weblogging features, including 
user accounts, comments, categories, and themes, and is extensible through a large library of 
third-party plug-ins. Movable Type is free of charge for personal blogs with unlimited authors
and blogs; paid support and commercial licenses, which allow more authors and weblogs on a
single installation of the software, are also available. The current version is 3.35.

Movable Type is written in Perl, and supports storage of the weblog's content and associated 
data within MySQL, Berkeley DB, PostgreSQL, and SQLite; dynamic page generation is 
handled by Smarty. Movable Type Enterprise also supports the Oracle database and Microsoft 
SQL Server. The application supports static page generation (in which files for each page are 
updated whenever the content of the site is changed), dynamic page generation (in which pages 
are composited from the underlying data as the browser requests them), or a combination of the 
two technologies. 
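
The difference between the two publishing modes can be sketched as follows. This is a simplified illustration in Python rather than Movable Type's Perl; the `POSTS` dictionary stands in for the database, and all names are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical content store; a real system would keep this in a database.
POSTS = {"hello": "My first post.", "news": "Version 4 is in beta."}


def render(slug: str) -> str:
    return f"<html><body><h1>{slug}</h1><p>{POSTS[slug]}</p></body></html>"


def publish_static(output_dir: Path) -> list:
    """Static generation: write one file per page whenever content changes."""
    written = []
    for slug in POSTS:
        target = output_dir / f"{slug}.html"
        target.write_text(render(slug), encoding="utf-8")
        written.append(target.name)
    return sorted(written)


def serve_dynamic(slug: str) -> str:
    """Dynamic generation: build the page from the data at request time."""
    return render(slug)


site_dir = Path(tempfile.mkdtemp())
files = publish_static(site_dir)
```

Static files cost nothing to serve but must be rebuilt after every change, whereas dynamic pages are always current at the price of work on each request.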

With the release of version 3.0, there were marked changes in Movable Type's licensing,
most notably placing greater restrictions on its use without paying a licensing fee. This sparked
criticism from some users of the software. With the release of Movable Type 3.2, the ability to
create an unlimited number of weblogs at all licensing levels was restored. As of Movable Type 3.3,
it is completely free for personal users.

Six Apart released the Movable Type 4 beta on June 5, 2007 and re-launched movabletype.org as
a community site for purposes of developing an open-source version (MTOS) to be released
under the GPL in Q3 2007.

The Movable Type Enterprise version has become a standard for corporate and enterprise blogging
because of advanced features such as LDAP management, integration with enterprise
databases such as Oracle and MySQL, user roles, blog cloning and automated blog provisioning. It
is also available as part of Intel's SuiteTwo professional software offering of Web 2.0 tools.

Six Apart also maintains three other weblog publishing systems, TypePad, LiveJournal and 
Vox. While Movable Type is a system which needs to be installed on a user's own web server, 
TypePad is a hosted weblog service. LiveJournal is an entirely separate blogging system, 
acquired after Six Apart's buyout of Danga Interactive in January 2005. 

Movable Type was originally named Serge after musician Serge Gainsbourg. 


Serendipity 


Developer: Serendipity Developer Team
Latest release: 1.1.3 / 17 June 2007
Genre: content management system
License: BSD license
Website: Serendipity


Serendipity is a PHP-based blog and web-based content management system. It is available
under a BSD license. It supports PostgreSQL, MySQL, and SQLite database backends, the
Smarty template engine, and a plugin architecture for user-contributed modifications.


See Also 


• Comparison of content management systems


TypePad 


URL: http://www.typepad.com/
Commercial?: Yes
Type of site: Blog hosting
Registration: Optional
Owner: Six Apart Ltd
Created by: Six Apart Ltd

TypePad is a blogging service from company Six Apart Ltd, generally considered the largest 
paid blogging service in the world. Originally launched in October 2003, TypePad was based on 
Six Apart's Movable Type platform, and shares technology with Movable Type such as 
templates and APIs, but is marketed to non-technical users and includes additional features like 
multiple author support, photo albums and moblogging. 

The service is available in several languages and countries around the world, but in the 
United States is sold at three different subscription levels. 


TypePad is currently used by many large organizations and media companies to host their
weblogs, such as ABC, MSNBC, the CBC, the BBC and Sky News.


Open source content management systems 


b2evolution | Drupal | e107 | eZ Publish | Joomla! | LifeType | Mambo | PHP-Nuke | 
PostNuke | TYPO3 | WordPress | XOOPS | Xaraya | Zope 


b2evolution 


Developer: Francois Planque
Latest release: 1.10 / May 2, 2007
OS: Cross-platform
Genre: Blog software
License: GNU GPL
Website: www.b2evolution.net


b2evolution is a multi-lingual, multi-user, multi-blog publishing system written in PHP and 
backed by a MySQL database. It is distributed under the GNU General Public License and is 
available without charge. Francois Planque forked b2evolution from version 0.6.1 of b2\cafelog 
in 2003. 

b2evolution is focused on ease of installation and feature richness. It can easily be installed 
on almost any LAMP (Linux, Apache, MySQL, PHP) host in a matter of minutes. With the latest 
release, no configuration file editing is necessary, as all configuration is handled through the 
installer or administrative back-end. b2evolution allows an admin user to lock himself out 
permanently by redirecting the admin.php page to a nonexistent URL on the root domain. 

Features in current releases include: community-wide spam filters, in which many 
b2evolution sites aggregate and "tag" spammer IPs into a central blacklist for the benefit of all 
b2evolution blogs; a fully skinnable interface; localization into a dozen language packs; and a 
fully exposed API for plugin developers to add new functionality. 

Several popular plugins have been developed for b2evolution. These include text format
extensions enabling Textile, Auto-P, Greymatter, BB code, Texturize, LaTeX, and graphic
smilies. Plugins that facilitate full integration with third-party tools such as Gallery 2,
YouTube and Digg are also available.


Drupal 


Latest release: 5.2 / July 26, 2007
OS: Cross-platform
Genre: Content management framework, Content management system, Community and Blog software
License: GPL
Website: http://drupal.org


Drupal (IPA pronunciation: [ˈdruːpəl]) is a free and open source modular content
management system written in the programming language PHP. A content management system
(or CMS for short) is a web application that handles the display of content and lets users
manage it with little or no knowledge of the language the system was programmed in, which in this
case is PHP.

Drupal is able to run on a number of different platforms, provided the system
can run one of the two supported web servers, Apache or IIS. Since Drupal, like other content
management systems, uses a database to store content and settings, it requires a database package
such as MySQL or PostgreSQL.

As of July 26, 2007, the current version of Drupal was 5.2.


History 


Originally written by Dries Buytaert as a bulletin board system, Drupal became an open
source project in 2001. Drupal is an English transliteration of the Dutch word “druppel,” which
means “drop” (as in “a water droplet”). The name was taken from the now defunct Drop.org
website, whose code slowly evolved into Drupal. Buytaert wanted to call the site “dorp” (Dutch
for “village”, referring to its community aspects), but made a typo when checking the domain
name and thought the result sounded better.

Over the years, Drupal has gained popularity. From May 2006 to April 2007, Drupal was
downloaded from the Drupal.org website more than 600,000 times. A large community now
takes part in Drupal's ongoing development.


Content Management System 


Drupal has a basic layer, or "core", which provides essential features and supports pluggable
modules that add functionality.
Modules included in Drupal's core enable users to:
• Post, revise, and categorize content
• Conduct searches
• Post comments
• Take part in forums
• Vote in polls
• Work on collaborative writing projects
• Post and view personal profiles
• Communicate among themselves or with the managers of a site
• Change the look of a site with off-the-shelf or custom-made themes
• Build multi-level menus
• Provide users with an interface in their local language
• Provide RSS feeds
• Gather content from the RSS feeds of other sites
• Register and manage user accounts
• Assign fine-grained user roles, granting users permission to use selected
features of a site
• Use access rules to deny site access to specified usernames, e-mail addresses,
and IP addresses
• Provide statistics and reports for administrators
• Manage caching and throttling to improve how a site performs in heavy traffic
• Construct and specify various input filters and content types
• Generate user-friendly, easy-to-remember URLs (for example,
"www.mysite.com/products" rather than "www.mysite.com/?q=node/432")

The version control system, also a core feature, tracks the details of content updates:
who changed the content, what was changed, the date and time of the changes, and so on. The system
provides for a comment log and enables users to roll back content to an earlier version.
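
The revision tracking described here can be sketched as an append-only list of versions. The following Python illustration is not Drupal's implementation (Drupal is written in PHP and stores revisions in its database); the class name `RevisionedNode` and its methods are assumptions.

```python
from datetime import datetime, timezone


class RevisionedNode:
    """Stores every revision of a piece of content with who, when, and why."""

    def __init__(self, body: str, author: str):
        self.revisions = []
        self.save(body, author, "initial version")

    def save(self, body: str, author: str, log: str = "") -> int:
        self.revisions.append(
            {"body": body, "author": author, "log": log, "time": datetime.now(timezone.utc)}
        )
        return len(self.revisions)  # 1-based revision number

    @property
    def body(self) -> str:
        return self.revisions[-1]["body"]

    def roll_back(self, number: int, author: str) -> int:
        """Restore an earlier revision by saving a copy of it as the newest one."""
        old = self.revisions[number - 1]
        return self.save(old["body"], author, f"rolled back to revision {number}")


node = RevisionedNode("Draft text", "alice")
node.save("Edited text", "bob", "fixed typos")
node.roll_back(1, "alice")
```

Rolling back by appending a copy, rather than deleting newer revisions, keeps the full history intact, including the rollback itself.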

Users and administrators can employ core features without needing to know PHP or HTML. 


Modules 


Drupal's modular design allows people with knowledge of PHP to write modules to 
implement additional features. The Drupal website provides many hundreds of free modules 
written by Drupal users. 

These modules provide, for example, e-commerce systems, workflow features, photo
galleries, organic groups, Google sitemaps, Amazon items, mailing list management, and
integration with CVS.

Integrating the modules with the core via a system of hooks, or callbacks, allows modules to 
insert functions into Drupal's path of execution. Drupal core provides protection against many of 
the usual security problems, like SQL injection. 
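
The general idea of hooks can be sketched with a registry of callbacks. This is a Python analogy only: Drupal itself is written in PHP and finds hook implementations in its own way, so the `implements` decorator, `invoke_all` function and the example hook name below are illustrative assumptions, not Drupal's API.

```python
from collections import defaultdict

# Registry of callbacks keyed by hook name (names here are illustrative, not Drupal's API).
_hooks = defaultdict(list)


def implements(hook_name: str):
    """Decorator a module uses to register a function for a hook."""
    def register(func):
        _hooks[hook_name].append(func)
        return func
    return register


def invoke_all(hook_name: str, *args) -> list:
    """Called by the core at a fixed point in its execution path."""
    return [func(*args) for func in _hooks[hook_name]]


@implements("menu")
def gallery_menu():
    return {"path": "gallery", "title": "Photo gallery"}


@implements("menu")
def shop_menu():
    return {"path": "shop", "title": "Shop"}


menu_items = invoke_all("menu")
```

The core never needs to know which modules exist; it only asks every registered implementation of a hook for its contribution.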


Themes 

Most themes for Drupal are written in the PHPTemplate engine or the XTemplate engine.
Earlier templates used hard-coded PHP.

Earlier versions of Drupal's theming system were criticized as being less design-oriented
and more complicated than the systems for Mambo and Plone. The inclusion of the
PHPTemplate and XTemplate engines in Drupal has addressed some of these criticisms.


Translations


As of August 2007, translations for Drupal's interface were available in 37 languages other
than English (the default). Supported languages include some that read right to left, such as
Arabic and Hebrew.


Criticism 




The installation of Drupal (and its modules) requires access to a database as well as certain 
high-level privileges, including the ability to use SQL commands such as SELECT, INSERT, 
UPDATE, DELETE, CREATE, DROP, INDEX, ALTER, and LOCK TABLES. Some Web 
hosting providers, however, do not offer these features. Anyone who wishes to use Drupal 
should ensure that their host offers these features before they begin installation. 
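
A pre-installation check of this kind amounts to comparing the privileges a host grants with the list above. The sketch below assumes the granted privileges are already known as a list of strings; the function name missing_privileges and the example grant list are hypothetical, and how a particular host reports its grants is not modelled.

```python
# Privileges named in the text as required for installing Drupal.
REQUIRED = {
    "SELECT", "INSERT", "UPDATE", "DELETE", "CREATE",
    "DROP", "INDEX", "ALTER", "LOCK TABLES",
}


def missing_privileges(granted):
    """Return the required privileges absent from `granted`, sorted by name."""
    have = {name.strip().upper() for name in granted}
    return sorted(REQUIRED - have)


# Hypothetical grant list from a restrictive hosting plan.
host_grants = ["select", "insert", "update", "delete", "create", "drop"]
print(missing_privileges(host_grants))
```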

As with other content management systems, one can set up the initial database from the 
command line[1] or with phpMyAdmin/phpPgAdmin. Since version 5.0, however, one can install 
Drupal and set up the database almost entirely from a web-based interface. 

Some have considered Drupal more difficult to learn and slightly more difficult to install 
than simple CMS programs or basic blogging tools such as WordPress. Drupal 5.0, 
released January 15, 2007, is packaged with a web-based installer to partly answer these 
criticisms, and Drupal 6.0, which may be released in September, goes even further. 

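Some programmers criticize Drupal because they perceive it as not being object-oriented. 
The document "Drupal programming from an object-oriented perspective", however, explains 
how the principles of object-oriented programming (OOP) and aspect-oriented programming 
(AOP) apply to Drupal. 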


Distributions 


Drupal 4.2 was the basis for DeanSpace, a content management system used to power 
many independent websites supporting the 2004 presidential campaign of Howard Dean. After 
the Dean campaign ended, the DeanSpace project grew into CivicSpace, a Drupal-based 
"grassroots organizing platform that empowers collective action inside communities and 
cohesively connects remote groups of supporters." CivicSpace includes CiviCRM and other 
features useful on websites for nonprofit organizations and political campaigns. 

There are several other customized Drupal distributions. Most are simply Drupal repackaged 
with third-party modules, but some also include modifications to the core. An example of such a 
distribution is vbDrupal, which is Drupal integrated with vBulletin. 


Books 


Drupal has been discussed in several books: 

• Pro Drupal Development (April 2007) by John K. VanDyk and Matt Westgate 
(ISBN 1590597559) 

• Drupal: Creating Blogs, Forums, Portals, And Community Websites by David 
Mercer (ISBN 1904811809) 

• Building Online Communities With Drupal, phpBB, and WordPress by Robert T. 
Douglass, Mike Little, and Jared W. Smith (ISBN 1590595629) 

• The revolution will not be televised: democracy, the internet, and the overthrow of 
everything by Joe Trippi (ISBN 0-06-076155-5) 

• The power of many: how the living web is transforming politics, business, and 
everyday life by Christian Crumlish (ISBN 0782143466) 

• We the media: grassroots journalism by the people, for the people by Dan Gillmor 
(ISBN 0-596-00733-7) 

• Drupal. Community-Websites entwickeln und verwalten mit dem Open Source-CMS. 
(German) by Hagen Graf (ISBN 3827323215) 


See also 


• List of content management systems 


e107 


Developer: e107 Development Team 
Latest release: 0.7.8 / February 17, 2007 
OS: PHP-compatible (Linux preferred) 
Genre: Content management systems 
License: GNU General Public License 
Website: e107.org 


e107 is an open source content management system (CMS) that allows for the quick creation 
and management of websites or community portals. It is built with PHP and uses MySQL for 
database support, and it can be used for public websites or for local intranet pages. Support 
for several languages is available as additional downloads. 

Its name is derived from the fact it was the 7th main project the creator had worked on. 

e107 is released under the terms of the GNU General Public License. 


History 


e107 takes its origin from code originally used on the LiteStep websites. The developer known 
as jalist, who was responsible for the development of those websites, wanted to reuse some of 
the code from litestep.net and ls2k.org to build a more modular system. This system could then 
be used as a codebase for other people to create their own community-driven websites. 

The codebase was maintained solely by jalist until version 0.612 when a development team 
was formed. The development team now develops, maintains and builds releases for the e107 
system. 

In 2006, e107 was nominated by the public as one of the five finalists in the Packt Open 
Source Content Management System Award. The results are published by Packt. 


Features 


• High-performance file-based caching 
• Integrated news system and RSS feed handling 
• Template system that is simple to use and to create templates for 
• Valid XHTML 1.1 output 
• Powerful integrated forum system 
• Easy-to-use admin system 

The minimum requirements for e107 are: 

• MySQL (version 3.22 or newer) 
• PHP (version 4.3.0 or above) 

PHP must have been compiled with support for MySQL in order to run e107 successfully. 
Apache is recommended for running e107, but any server technology that is compatible with 
PHP and MySQL should work. e107 also works on Windows-based servers running IIS, MySQL 
and PHP. 


Release history 


• February 17, 2007: e107 Release 0.7.8 (Almost exclusively a bug fix release) 
• December 7, 2006: e107 Release 0.7.7 (Bug fixes) 
• November 22, 2006: e107 Release 0.7.6 (Enhancements, bug fixes and 
improvements) 
• May 23, 2006: e107 Release 0.7.5 
• May 4, 2006: e107 Release 0.7.4 (Minor omissions in 0.7.3 corrected) 
• May 3, 2006: e107 Release 0.7.3 (Security fixes) 
• February 10, 2006: e107 Release 0.7.2 (Security fixes) 
• January 18, 2006: e107 Release 0.7.1 (Bug fixes) 
• January 16, 2006: e107 Release 0.7.0 (First official release of e107 version 0.7) 


Note about versions 


Releases were numbered in a standard fashion until version 5.4, when it was decided to alter 
the version numbering. The next major version was released as version 0.6, and versioning has 
continued in this fashion for all subsequent releases. 


See also 


• Comparison of content management systems 


References 


• Boomer, Tad (February 2007). Building Websites with e107. Packt Publishing. ISBN 
1904811310. 


eZ Publish 


Developer: eZ Systems 
Latest release: 3.9.2 / May 4, 2007 
OS: Cross-platform 
Platform: PHP 
Genre: Content management system 
License: GNU General Public License 
Website: http://ez.no/ 


eZ Publish is an open source enterprise content management system. It is developed by the 
Norwegian company eZ Systems and a growing number of users and developers worldwide. eZ 
Publish is available for free download under the GPL licence, as well as under proprietary 
licences with commercial support. eZ Publish aims to support the development of professional 
web applications in PHP. 

The name eZ Publish is pronounced "easy publish". 


Areas of use 


eZ Publish supports the development of professional, customized web applications. Typical 
applications range from a personal homepage to a multilingual corporate website including 
role-based multi-user access, e-commerce functions and online communities. 

According to eZ Systems, eZ Publish is used for tens of thousands of web applications of 
varying type and size worldwide, among them MIT (specifically the controller's office), Vogue 
magazine, NASA, the US Navy DASN and the Swiss public broadcasting organisation 
Schweizer Fernsehen. Further examples of eZ Publish-based projects are listed on the eZ 
Publish reference page. 


Handling 


eZ Publish is managed via a Web browser, thus additional local software is not necessary. It 
also features a rich text editor that allows formatting content similar to a word processor (e.g. 
Word). This enables content editing and contribution without HTML skills. Content management 
can also be done through the eZ Publish frontend. 


Business model 


By providing the software for free, eZ Systems follows an approach called "best of both 
worlds". eZ Publish may be used and modified according to the GPL. In addition, for-fee 
professional support is available, which includes an automatic maintenance service (eZ Network) 
and a bugfix guarantee. Furthermore, a "Professional Licence" is available, granting the right to 
use eZ Publish under licence conditions different from the GPL. With this model, eZ Systems 
strives to combine the advantages of commercial and free software. The company has received 
several awards for this business model, including the "Norwegian Prize for promoting Free 
Software", and was named by EContent as one of the 100 most influential companies in the IT 
industry. 

eZ Systems has a network of certified partner companies that perform application 
implementations. As of February 2007 there are approximately 170 partners worldwide. The 
company employs about 80 people in Norway, Ukraine, France, Canada and 
Germany. According to eZ Systems, its developer community includes tens of thousands of 
programmers. 


Functional range 


The eZ Publish functional range targets the quick, professional and secure realization of web 
applications. Functional criteria are (besides standards such as sitemaps, search and printing 
function): 

• logic for content versioning; 
• a media library; and 
• role-based rights management. 

Furthermore, custom changes can be made to eZ Publish. For this, the system's architecture 
provides "Extensions", which are meant to contain individual functions. This allows the kernel 
to be upgraded to new versions even after customization. Finally, there are several hundred 
contributions provided by the community. eZ Systems integrates such contributions into the 
kernel on a continuous basis. This is done especially to avoid mixed installations of the kernel 
and custom plugins, which could lead to serious problems (e.g. when migrating an existing 
installation to new versions of PHP, as such plugins are usually supported unpredictably). 


Technology 


As a LAMP application, eZ Publish is based on PHP. The recommended web server is 
Apache. This makes the software independent of the operating system: eZ Publish can be run 
on Windows as well as on different UNIX derivatives. 

One of the strictly applied development principles is a clean implementation of the database 
abstraction layer, which enables the use of nearly any common database by using drivers, thus 
rendering changes to the kernel unnecessary. eZ Systems recommends MySQL, but drivers for 
PostgreSQL, Microsoft SQL Server and Oracle are also available. As eZ Publish supports open 
standards such as XML and SOAP, it can be flexibly integrated into existing IT infrastructures. 
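
The driver-based database abstraction described above can be illustrated with a short sketch: application code talks to one small interface, and each database gets its own driver behind it. The Python below uses the standard library's sqlite3 module purely as a stand-in; the class names Driver and SQLiteDriver and the article table are assumptions for illustration, not part of eZ Publish.

```python
import sqlite3
from abc import ABC, abstractmethod


class Driver(ABC):
    """Interface that application code depends on."""

    @abstractmethod
    def query(self, sql, params=()):
        """Run a statement and return any resulting rows as a list."""


class SQLiteDriver(Driver):
    """One concrete driver; others could wrap MySQL, PostgreSQL and so on."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    def query(self, sql, params=()):
        cursor = self._conn.execute(sql, params)
        rows = cursor.fetchall()
        self._conn.commit()
        return rows


def count_articles(db: Driver) -> int:
    """Application code: written once, works with any Driver."""
    return db.query("SELECT COUNT(*) FROM article")[0][0]


db = SQLiteDriver()
db.query("CREATE TABLE article (title TEXT)")
db.query("INSERT INTO article (title) VALUES (?)", ("Hello",))
print(count_articles(db))
```

Because count_articles only sees the Driver interface, supporting another database means adding a driver class rather than changing application code.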

eZ Publish is cluster-ready and enforces the strict separation of information and design 
through XML storage of all content. This eases media-neutral design in terms of accessibility, 
e.g. for Braille devices or serving WAP browsers and mobile phones. 


eZ Components 


Future versions of eZ Publish will be based on eZ Components. eZ Components is a library 
of standardized modules for speeding up application development. Among other things, it 
provides functions for compressing binary files, optimizing performance through caching, 
connecting to several databases, debugging, RSS, generating graphs for analysis, converting 
images, supporting email and validating user input. 

eZ Components is conceived as an alternative to the PEAR project. The components are 
published under the BSD licence. Nevertheless, commercial support is available from eZ 
Systems. In addition, eZ Systems is developing "eZ Platform", which will be a development 
environment for PHP 6. 


Criticism 


eZ Publish has been criticized for the lack of a built-in WYSIWYG editor and for its rather 
complicated licensing [1]. The Online Editor has been included by default since version 3.8.1 
(June 22, 2006), which addresses the WYSIWYG editor concern. 


References 


• Paul Borgermans, Tony Wood, Paul Forsyth, Martin Bauer, Bjorn Dieding, Ben 
Pirt: Learning eZ Publish 3: Building Content Management Solutions, Packt 
Publishing 2004, ISBN 1904811019 

• Balazs Halasy, eZ publish basics, eZ press 2006, ISBN 82-92797-00-9 


See also 


• List of content management systems 


Joomla! 


Developer: The OSM Development Team 
Latest release: 1.0.13 / July 21, 2007 
Preview release: 1.5 RC1 / July 21, 2007 
OS: Cross-platform 
Genre: Content Management System 
License: GPL 
Website: www.joomla.org 

Joomla! is a free, open source content management system written in PHP for publishing 
content on the World Wide Web and intranets, using the MySQL database. Joomla! includes 
features such as page caching to improve performance, RSS feeds, printable versions of pages, 
news flashes, blogs, polls, website searching, and language internationalization. Joomla! is 
licensed under the GPL, and is the result of a fork of Mambo. 


History 


Joomla! came into being as the result of a fork of Mambo between Miro Corporation of 
Australia, the trademark holder of the Mambo name at that time, and all of the then-core 
developers. The two groups parted ways on August 17, 2005. The Miro Corporation formed a 
non-profit foundation with the stated purpose to fund the project and protect it from lawsuits. 
The development team claimed that many of the provisions of the foundation structure went 
against previous agreements made by the elected Mambo Steering Committee, lacked the 
necessary consultation with key stakeholders, and included provisions that violated core open 
source values. 

The development team created a web site called OpenSourceMatters to distribute information 
to users, developers, web designers, and the community in general. The project team leader at the 
time, Andrew Eddie, also known as "MasterChief," wrote an open letter to the community 
which appeared on the announcements section of the public forum at mamboserver.com. 

One thousand people had joined the opensourcematters.org forum web site within a day, 
most posting words of encouragement and support for the actions of the Development Team. The 
web site received a slashdotting, and news articles regarding the event appeared at 
newsforge.com, eweek.com, and ZDnet.com. Miro CEO Peter Lamont gave a public response in 
an article entitled "The Mambo Open Source Controversy - 20 Questions With Miro". 

This event has stirred deeply held feelings in the free software community regarding what 
shall constitute "open source". Forums at many other open source projects were active with 
postings for and against the actions of both sides. Rumor and accusations of wrongdoing by Miro 
and the Mambo Foundation were rampant. 

In the two weeks following Eddie's announcement, teams were reorganized and the 
community continued to grow. On September 1, 2005 the new name, "Joomla!", which is the 
English spelling of the Swahili (and Urdu) word jumla meaning "all together" or "as a 
whole", was announced to a mixed reception of 3000+ faithful followers of the Development 
Team. It was chosen to reflect the commitment of the development team and community to the 
project. 

The first release of Joomla! (Joomla! 1.0.0) was announced on September 16, 2005. This was 
a re-branded release of Mambo 4.5.2.3 combined with other bug and moderate-level security 
fixes. In the project's roadmap, the core developers say Joomla! 1.5 will be a completely re- 
written code base built with PHP 5. It was announced in 2006 and has been nominated for the 
vaporware award 2007. 

Joomla! won the Packt Publishing Open Source Content Management System Award in 
2006. 


Features 


The Joomla! package consists of many different parts, which are built to be as modular as 
possible, allowing extensions and integrations to be made easily. One example is the class of 
extensions called "Plugins" (previously known as "Mambots"). Plugins are background 
extensions that extend Joomla! with new functionality. The WikiBot, for example, allows the 
author of Joomla! content to use "Wikitags" in Joomla! articles which will auto-create dynamic 
hyperlinks to Wikipedia articles when displayed. There are over 1,900 extensions for Joomla! 
available via the Extensions Directory, a site that OpenSourceMatters runs as an official 
directory of extensions. 

In addition to Plugins, more comprehensive extensions are available. "Components" allow 
webmasters to perform such tasks as building a community by expanding user features, backing 
up a website, and creating URLs that are friendlier to search engines. "Modules" perform such 
tasks as displaying a calendar or allowing custom code to be inserted within the base Joomla! 
code. 


Community 


Joomla! has one official community and many unofficial ones. As of July 2007, the official 
Joomla! forums claim 178,000 threads and over 920,000 posts from over 120,000 members in 
40 languages. Unofficial sites are published in many languages, often with Joomla! extensions 
that are region-specific. Bi-directional text support for the Hebrew and Arabic languages, for 
example, can be found on third-party community portals. Unofficial web developers also build 
extensions and web templates for commercial sale and offer freelance customization services. 


See also 


• List of content management systems 


Literature 


• Pirtle, Mitchell (August 7, 2006). The Definitive Guide to Joomla!. Apress. ISBN 
1-59059-571-8. 

• Graf, Hagen (February 22, 2006). Building Websites with Joomla!. Packt 
Publishing. ISBN 1904811949. 

• Graf, Hagen (February 22, 2007). Building Websites with Joomla! 1.5 Beta 1. Packt 
Publishing. ISBN 1847192386. 

• LeBlanc, Joseph (May 2007). Learning Joomla! Extension Development: Creating 
Modules, Components, and Plugins with PHP. Packt Publishing. ISBN 1847191304. 

• North, Barrie (April 2007). The Joomla Admin Manual: A Step by Step Guide to a 
Successful Website. LuLu. ISBN 9780615146751. 

• Rahmel, Dan (July 25, 2007). Beginning Joomla!: From Novice to Professional. 
Apress. ISBN 1590598482. 

• Rahmel, Dan (2007). Professional Joomla!. Wrox. ISBN 978-0-470-13394-1. 


LifeType 


Developer: Project members 
Latest release: 1.2.3 / June 8, 2007 
OS: Cross-platform 
Genre: CMS/Blog software 
License: GPL 
Website: www.lifetype.net 


LifeType is an open-source blogging platform with support for multiple blogs and users in a 
single installation. It is written in PHP and backed by a MySQL database. LifeType is licensed 
under the GNU General Public License. 

LifeType began as a project to create a flexible and powerful blogging platform for both 
users and developers. It was designed to have a scalable architecture while keeping the core as 
light as possible, to ensure an efficient and fast web-publishing engine. This approach has been 
maintained while adding several crucial features that have made LifeType comparable to other, 
more established blog software. LifeType integrates several open-source components, chosen 
so that the most suitable code is used in the LifeType framework. Together, these components 
provide a full-featured blogging solution that site owners can deploy as a multi-user, multi-blog 
hosting service and that everyday bloggers can use to publish content easily. 

LifeType offers support for multiple blogs and multiple users in one installation. Each blog 
can be run in its own language and can be customized via a powerful template engine. Moreover, 
LifeType is the only one of its kind to come with a fully featured anti-spam mechanism based on 
Bayesian filtering techniques, to help combat blog spam. Media uploads and file 
handling, a fully customizable search engine, friendly URLs and a thorough, easy-to-use 
administration area are also among the core features of LifeType. 
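
Bayesian spam filtering, mentioned above, scores a comment by how likely its words are under "spam" versus "ham" (legitimate) training examples. The following is a minimal naive Bayes sketch in Python with add-one smoothing; it is not LifeType's PHP implementation, and the class name, tokenization and training sentences are assumptions for illustration.

```python
import math
from collections import Counter


class BayesFilter:
    """Tiny naive Bayes text classifier with add-one (Laplace) smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record one labelled example; label is "spam" or "ham"."""
        self.docs[label] += 1
        self.words[label].update(text.lower().split())

    def _log_score(self, tokens, label):
        # Requires at least one training example per label.
        vocabulary = len(set(self.words["spam"]) | set(self.words["ham"]))
        total = sum(self.words[label].values())
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for token in tokens:
            count = self.words[label][token]
            score += math.log((count + 1) / (total + vocabulary))
        return score

    def is_spam(self, text):
        tokens = text.lower().split()
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")


spam_filter = BayesFilter()
spam_filter.train("buy cheap pills now", "spam")
spam_filter.train("cheap pills online", "spam")
spam_filter.train("great post thanks for sharing", "ham")
spam_filter.train("thanks for the tutorial", "ham")

print(spam_filter.is_spam("cheap pills"))
print(spam_filter.is_spam("thanks for sharing"))
```

Working in log space avoids multiplying many small probabilities, and the smoothing keeps an unseen word from zeroing out a score.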

For simplicity, LifeType integrates a web-based wizard to help users set up and configure 
their own blog. LifeType can be configured for a single user or for multiple users. 

With the integration of the template editor plugin, users can make custom adjustments to 
LifeType's templates. As an open source project, LifeType is built, maintained and adjusted by 
the LifeType community of designers. 


Features 


• Multiple Blogs per Installation (suitable for blog hosting) 
• Subdomains 
• Multiple Users per Blog 
• Integrated Media Management (Podcasting, automatic thumbnail generation, 
mass uploading of files, a filebrowser and custom descriptions for each file) 
• Extensible plugin support 
• Anti-Spam features (Bayesian spam filter, comment moderation, Captcha, 
trackback validation) 
• Localization 
• XMLRPC 
• Mobile features (moblogging) 


History 


The LifeType project started in February 2003 under the name pLog. pLog was renamed 
LifeType after Amazon.com, the holder of the "plog" trademark, requested a change of name. 


Mambo 


Developer: Mambo Foundation Inc. 
Latest release: Latest releases / April 24, 2007 
OS: Cross-platform, PHP-compatible (Linux preferred) 
Genre: Content management system 
License: GNU General Public License v.2 
Website: mambo-foundation.org 


Mambo (formerly named Mambo Open Source or MOS) is a free software / open source 
content management system (CMS) for creating and managing websites through a simple web 
interface. It has attracted many users due to its ease of use. Mambo also includes more advanced 
features such as page caching to improve performance on busy sites, advanced templating 
techniques, and a fairly robust API. It can also automate many tasks such as web indexing for 
static pages. Mambo can provide RSS feeds, printable versions of pages, news flashes, blogs, 
forums, polls, calendars, website searching, language internationalization, and other possibilities. 

Mambo is released under the terms of the GNU General Public License (GPL) Version 2. 

The Mambo codebase, name and copyrights are protected by the Mambo 
Foundation, a non-profit corporation formed to support and promote the Mambo Open Source 
project. 

The project site and downloads can be found at http://source.mambo-foundation.org, and an 
online demo of Mambo is available at the Open Source CMS demo site. 


Requirements 


As of 8 June 2006, the minimum requirements for Mambo are: 

• Apache (version 1.3.19 or above) 
• MySQL (version 3.23.x or above) 
• PHP (version 4.2.x or above) 

Mambo is thoroughly tested on Linux, FreeBSD, Mac OS X and Windows NT/2000/XP. 
Linux or one of the BSDs is recommended, but anything else that can run the three pieces of 
software listed above should work. Additionally, Mambo displays properly in most major 
browsers: Internet Explorer (version 5.5+), Netscape, Opera and Mozilla/Firefox. 

PHP must have been compiled with support for MySQL and Zlib in order to run Mambo 
successfully. There are reports that Mambo works on Microsoft IIS, but Apache is recommended 
for running Mambo on Windows. 


Mambo Foundation 


The Mambo Foundation is a non-profit entity established under the laws of Australia. The 
Foundation is modelled on the Eclipse and GNOME foundations and is controlled by its members 
via an elected Board of Directors. The Mambo Foundation's brief is to foster the development of 
the Mambo system and to shelter the project from threats and misuse. The Foundation was 
formed in August 2005. The Mambo Foundation also has forums addressing the latest issues 
within Mambo, including the deployment of Mambo 5. 


History of Mambo 


2000: 


Miro Construct Pty Ltd, registered in March 2000 in Melbourne and headed by CEO 
Peter Lamont, a former advertising executive, starts development of Mambo, a closed-source, 
proprietary content management system. 


2001: 




The company adopted a dual licensing policy, releasing Mambo Site Server under the GPL 
on Sourceforge in April 2001. 

CEO Peter Lamont is quoted as saying at that time that the code was released to open source 
with the intention of gathering source contributions to improve it, until they could release it as a 
commercial product. 

The mamboserver.com domain was registered in May, 2001. 

From this time until the middle of 2002, Miro was the only developer of Mambo, 
contributing bug-fixes and security patches but not really extending the code or adding to the 
feature sets. 


2002: 


Miro releases the commercial CMS called Mambo 2002. 

With version 3.0.x, the open source Mambo Site Server becomes "Mambo Open Source" 
(commonly referred to as "MOS"). 

Robert Castley becomes Project Director of Mambo Open Source. 

By the end of 2002, Robert Castley had pulled together a volunteer team of developers. 

Mambo Open Source 4.0 is released. 


2003: 


Early in 2003, Miro hands off the responsibility of the code fully to the Open Source project 
Development Team. Miro concentrates on its commercial products and Mambo Open Source 
goes from strength to strength under the leadership of Robert Castley. 

Miro released Mambo CMS, a commercial version of Mambo Open Source. Miro claims that 
Mambo CMS does not contain any source added to Mambo after it was made open source. 

Miro Construct Pty Ltd goes into voluntary liquidation in February 2003 and in August, Miro 
International Pty Ltd is formed. Source code for Mambo Open Source shows copyright 2000 - 
2003 Miro Construct Pty Ltd. 

In late 2003, Mambo was targeted by legal threats concerning the intellectual property rights 
to certain pieces of code contained in the core. The problem was severe and cost money, man 
hours, and eventually the loss of some key community leaders. Miro came to the aid of Mambo, 
offering legal and corporate resources to protect the development team and preserve the program. 

Mambo Open Source 4.5 released in December 2003. By this time, almost all of the original 
Miro code had disappeared during refactoring. 


2004: 


mamboforge.net starts in March, 2004. 

Linux Format awards Mambo “Best Free Software Project” of the Year. 

Linux User and Developer names it “Best Linux or Open Source Software”. 

In mid-2004, the name "Mambo Open Source" (which was commonly referred to as MOS) 
was changed to just "Mambo", causing concern in the community over the confusion this 
would cause between the open source, community-developed CMS and Miro's commercial 
offering, "Mambo CMS". 

September: Mambo version 4.5.1 is released, followed a month later by version 4.5.1a stable. 

Robert Castley resigns as Project Director and in November, Andrew Eddie takes on the role. 

December: The Mambo Steering Committee is established with representatives from 
both Miro and the Mambo development team. This committee is designed to govern the 
Mambo project. 


2005: 


January: Andrew Eddie announces a strategic partnership between Mambo and Miro 
International Pty Ltd, with Miro proposing to offer financial support for the open source project, 
plus training, commercial support services, and developer certification. 

February: Discussions begin over the formation of a non-profit foundation for the Mambo 
project. 

Versions 4.5.2 and, later, 4.5.2.1 are released. 

The commercial Mambo CMS is renamed "Jango" in April, 2005. 

Mambo wins “Best Open Source Solution” and “Best of Show - Total Industry Solution” at 
LinuxWorld Boston. 

Mambo wins “Best Open Source Solution” at LinuxWorld San Francisco. 

July: mambo-foundation.org domain is established. 

August: Mambo Foundation, Inc is legally constituted on 8th August, 2005. Miro CEO, Peter 
Lamont, appoints himself President of the Board of the new Foundation. 

12th August: Robert Castley, who is an inaugural member of the Mambo Foundation Board 
of Regents, states: "The Foundation allows for everything to be placed outside of Miro incl. 
Domain Names, hosting etc." He goes on to say that with him, the original founder of Mambo 
Open Source, and Andrew Eddie both being on the Board of the Mambo Foundation, Mambo 
would continue as a successful, open source project. He concluded his statement with, "So there 
you have it two very key people in the overall success of Mambo are at the helm. Trust me, 
Mambo is in very, very safe hands!" 

A few days later, the entire team of core programmers publicly announced they had 
abandoned Mambo and shortly after this, Robert Castley steps down from the Board of Regents. 

The former core development team members regroup under the name "Open Source Matters" 
and the open source community at mamboserver.com fractures over allegations that the Mambo 
Foundation was formed without community input and with insufficient developer control. People 
express suspicion over the level of involvement by Miro International. By the end of August, the 
new project is named Joomla! and most of the former Mambo community has relocated to Open 
Source Matters. By the end of September, Open Source Matters Inc is a duly constituted non- 
profit corporation registered in New York. 

Joomla! positions itself as a "rebranding of Mambo" and releases its first fork of Mambo as 
Joomla 1.0 in September, 2005. The two code-bases are almost identical at this stage. 

Mambo forms a new core development team with Martin Brampton appointed as Core 
Development Team leader. 

Miro assigns all rights in the copyright of Mambo to the Mambo Foundation. 

November: Mambo version 4.5.3 is released. 

December: Miro International Pty Ltd is voluntarily deregistered as a company on 31st 
December, 2005. 


2006: 


January: The rights to Miro International Pty Ltd are sold by Peter Lamont and a new 
business entity called Miro Software Solutions is created. Miro Software Solutions continues to 
develop Jango and other proprietary software under new ownership. 

March: Mambo named "Best Open Source Software Solution" at LinuxWorld Australia. 

April: Core developer team leader, Martin Brampton, resigns and leaves the project. 

Chad Auld takes over the role as Core Developer Team leader. 

Mambo 4.5.4 is released. 

July: The Mambo Foundation websites become independent from Mambo Communities Pty 
Ltd. 

August: following elections, the new Board of the Mambo Foundation takes office. The 
Mambo Foundation is now completely independent of any corporate interest. 

September: Mambo version 4.6 is released. 

October: Mambo 4.6.1 is released. 


2007: 


February: Mambo 4.5.5 is released. This is a maintenance release for the 4.5.x branch. 

April: Mambo 4.6.2 is released. This is a maintenance release for the 4.6.x branch and 
enables localisation of Mambo. 


Awards 


• "Best Free Software Project of the Year" - Linux Format Magazine, 2004 
• "Best Linux or Open Source Software" - LinuxUser & Developer, 2004 
• "Best Open Source Solution" - LinuxWorld, Boston 2005 
• "Best of Show - Total Industry Solution" - LinuxWorld, Boston 2005 
• "Best Open Source Solution" - LinuxWorld, San Francisco 2005 
• "Best Open Source Solution" - LinuxWorld, Sydney, Australia 2006 


See also 


• List of content management systems 


PHP-Nuke 


Latest release: 8.1 
OS: Cross-platform 
Genre: Content Management System 
License: GNU General Public License 
Website: www.phpnuke.org 


PHP-Nuke is a web based automated news publishing and content management system (a 
"nuke") based on PHP and MySQL. The system is fully controlled using a web-based user 
interface. PHP-Nuke was originally a fork of the Thatware news portal system. 

The system is released as free software under the GNU General Public License. Until version 
7.5, the latest version of the software was freely downloadable from the PHP-Nuke website; 
version 7.5 is the first version for which a US$10 download charge is made. This is permitted 
under the GNU GPL (providing the source code is included), but the purchaser of the software 
has the right to freely distribute the source code of the product. This charge only applies to the 
latest version of PHP-Nuke; previous versions are available free of charge. Version 8.1 is being 
distributed for US$12. 

Starting with version 5.6, a copyright message must be displayed on web pages, in
accordance with section 2(c) of the GPL. This requirement was approved by the Free Software
Foundation, as explained on the official web site [1].

It requires a web server which supports the PHP extension (such as the Apache HTTP 
Server), as well as an SQL database (such as MySQL, mSQL, PostgreSQL, ODBC, Adabas, 
Sybase or InterBase). 


Features of PHP-Nuke 


The main aim of PHP-Nuke is to allow a webmaster to create a community-based portal 
(similar to that used by Slashdot), with an automated web site allowing users and editors to post 
news items (user-submitted news items are selected by editors). Users can comment on these 
articles using the comments system. 

Modules may be added to the PHP-Nuke system, allowing the webmaster to add more 
features (such as an Internet forum or calendar) to their PHP-Nuke installation in addition to the 
core modules such as News, FAQ and Private Messaging. The whole system is maintained by 
administrators using the web-based administration section. 

PHP-Nuke supports a number of different languages. Its look and feel can also be
customized (to an extent) using the Themes system, although some people have found it difficult
to make the site look any different from the standard three-column layout (as used by the
program's official website [2]).


Lack of security and criticism 


PHP-Nuke has in the past been criticized for containing many security holes. SQL injection 
is one of the most widely-known flaws in PHP-Nuke's security [3], although other methods of 
gaining access to the admin panel of a site running PHP-Nuke have been found [4][5]. In 
February 2005, the Webmail module of PHP-Nuke was removed from all versions due to 
security problems at the request of EV1Servers.net, phpnuke.org's web hosting service [6]. Since 
it contains a port of phpBB2, it also inherits phpBB's security flaws. 


Forks 


The PHP-Nuke project has been forked many times by different people, for different reasons. 
Notable forks include: 
• Dragonfly CMS - based on PHP-Nuke 6.5, with Coppermine Photo Gallery included
• Nuke-Evolution - based on PHP-Nuke, with many security fixes, custom modifications, and
PHP-Nuke modifications pre-installed
• openPHPNuke - based on myPHPNuke
• phpWebSite - project managed by the Web Technology Group at Appalachian State University
• PostNuke - based on PHP-Nuke 5.0
• XOOPS - based on PHP-Nuke and myPHPNuke
• Xaraya - based on PostNuke

Each of these forks takes its own approach to several aspects of the system in an attempt to
produce a better product. Nevertheless, most, if not all, of them can be used to build a
community portal similar to one built with PHP-Nuke, and each may be better than PHP-Nuke at
some things and worse at others.


PostNuke 


Latest release: 0.764 / November 20, 2006
Preview release: 0.800 Milestone 2 / September 29, 2006
OS: Cross-platform
Genre: Content Management System
License: GNU General Public License
Website: www.postnuke.com


PostNuke is a free, open source content management system (a 'nuke') forked from
PHP-Nuke, released under the GNU General Public License. The PostNuke project was started
because some PHP-Nuke users believed there should be a more open development environment. 
The users felt the only way this could be done was to develop their own fork of the PHP-Nuke 
code. PostNuke has grown beyond PHP-Nuke and become a project standing on its own merits. 
Like XOOPS, PostNuke's code is object oriented and fully modular. 


Features 


Currently PostNuke can provide RSS feeds, internet forums (through PNphpBB2, a 
modification of the popular forum software phpBB, as well as pnForum and other forums), polls, 
language internationalization, galleries, chat, e-commerce, games, reviews, links, as well as 
countless other features with various hooks and modules, many of which are free. PostNuke is 
also themeable using several theme engines such as Xanthia and AutoTheme (also known as AT, 
or AT-Lite), as well as PostNuke's native theming system. Both Xanthia and AutoTheme offer 
page caching to speed up page generation times. PostNuke is written in PHP and stores its data in 
a MySQL database. Future versions will work with any database supported by the ADOdb 
driver. PostNuke will run on any platform that PHP will run on, including the Apache and IIS 
webservers. 


Books 


• Postnuke Content Management by Kevin J. Hatch (ISBN 0672326868)
• Das PostNuke Kompendium by Markus Gossmer, Michael Schumacher, Andreas Schauperl
(ISBN 3540219420)


Current Development


The current version of PostNuke is 0.764. Development is currently focused on the
release of version 0.8, with Milestone 3 available for download. The 0.7x version is now only
updated for security fixes, with 0.8x at the centre of development.


See also 


• List of content management systems


TYPO3


Developer: TYPO3 Association
Latest release: 4.1.2 / July 16, 2007
OS: Cross-platform
Genre: Content Management System
License: GPL
Website: www.typo3.com, www.typo3.org (Developer Resource)


TYPO3 is a free, open source content management system written with PHP and MySQL. It
has become one of the leading content management systems (CMS) on the web and in intranets.
TYPO3 is not to be confused with Typo, which is an open source blogging engine.


Author and availability


Initially authored by Kasper Skårhøj, it is available for free and licensed under the GNU
General Public License. Today, the core of TYPO3 is developed by two teams in a maintenance 
tree (4.x, leader Michael Stucki) and a development tree (5.x, leader Robert Lemke). 
Independent authors have contributed hundreds of pluggable extensions. 

TYPO3 offers full flexibility and extendability while featuring an accomplished set of ready- 
made interfaces, functions and modules. Nearly 2000 extensions are available for download 
under the GNU General Public License from a repository called the TYPO3 Extension 
Repository, or TER. 


Operating modes 


TYPO3 has two operating modes: a frontend, which presents a TYPO3 based website to its 
users, and a backend, used by authors and site administrators to manage content for the website. 
The backend is used with an ordinary web browser, so there is no special software needed to 
work on a TYPO3-based website. 


Design 


The system is based on templates. People can choose an existing template and change 
features such as logo, colors, and fonts, or they can construct their own templates using a 
configuration language called TypoScript. In this simple notation, a large object tree structure of 
information is created with placeholders for data from the database to fill in. Preprogrammed 
objects are configured by changing or adding values or features. This object tree structure is 
saved in a text file. Various editors are available to change the content in a structured way. The 
PHP generator uses this data structure to drive the generation of the content. Besides conditions, 
TypoScript has no other control structures; if real processing must be performed, it is handed to a 
PHP function. The top level object is the PAGE object. There are various types of MENU 
objects. 


Syntax of TypoScript 


The basic syntax is:


[objectpath].[attribute] [operator] [value]


Operators are:
• = assignment of a value
• < copy a whole object
• =< insert a reference
• > remove object

Examples:

myObject.attribute1=Hello


The content is mainly stored in two tables: one table called pages and another called 
tt_content, which holds the elements contained in a page. Every page object has a unique 
identification key (uid) and is linked to a parent page. Thus, the pages are organized in tree form, 
allowing the system to easily generate menus and site maps. 
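
As a rough illustration of why the uid/parent link makes menus and site maps easy to generate, the sketch below models a few page records in Python and turns them into an indented tree. The record layout (tuples of uid, pid and title), the sample pages and the function names are assumptions made for this example; it is not TYPO3's actual schema or API.

```python
from collections import defaultdict

# Hypothetical page records: (uid, pid, title). pid is the uid of the parent
# page; 0 marks a page at the root of the tree.
PAGES = [
    (1, 0, "Home"),
    (2, 1, "Products"),
    (3, 1, "About us"),
    (4, 2, "Downloads"),
]


def build_children(pages):
    """Group pages by parent uid so each level can be looked up directly."""
    children = defaultdict(list)
    for uid, pid, title in pages:
        children[pid].append((uid, title))
    return children


def site_map(pages, root=0):
    """Return the page tree as a list of indented lines, depth first."""
    children = build_children(pages)
    lines = []

    def walk(parent, depth):
        for uid, title in children.get(parent, []):
            lines.append("  " * depth + title)
            walk(uid, depth + 1)

    walk(root, 0)
    return lines


if __name__ == "__main__":
    print("\n".join(site_map(PAGES)))
```

Because every record names its parent, a single pass groups the pages by level, and a menu is simply the list of children of one page.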

One key feature of TYPO3 is that it has a flexible application programming interface which 
enables independent developers to provide extra functionality. Modules using this API are called 
"extensions" and many developers provide their extensions in a publicly available repository on 
the TYPO3 developer portal. 


TemplaVoila 


TemplaVoila is an alternative template engine extension for TYPO3. Features include a 
graphical mapping tool for creating templates, a new page module, the ability to create flexible 
content elements and an API for developers. New content element types can be created quickly,
without programming, while retaining full control over the HTML output. TemplaVoila
gives more flexibility in maintaining web pages, while making it possible to enforce a strict
corporate design and allowing editors to work with content more intuitively.


System requirements 


Web server

• Operating System: Unix (e.g. Linux), Windows or Mac
• Web server: Apache, IIS
• Middleware: PHP4 or PHP5 (starting with TYPO3 4.2.0, all releases will require PHP5)
• Database: MySQL or any other database system supported by the TYPO3 DBAL
(e.g. Oracle, PostgreSQL and many others)
• Hardware: A normal web server setup will do, with a modern CPU and at least 256 MB RAM.
As with all database-driven applications, more RAM is advisable.

Client (user)

• Recommended: Mozilla Firefox on any OS, Internet Explorer 5+ on Windows
• Required: Any graphical browser on any OS (Internet Explorer, Opera, Safari,
Konqueror on Windows, Unix, Mac)
• Hardware: Relatively modern computer


See also 


• Comparison of content management systems


Further reading


• Rene Fritz, Daniel Hinderink, Werner Altmann - TYPO3: Enterprise Content
Management (Paperback) - ISBN 1-904811-41-8
• Michael Peacock - Building Websites with TYPO3 (Paperback) - ISBN 1-847191-11-8


XOOPS 


Stable release: 2.0.16 (2006/11/17)
Preview release: 2.2
Use: content management system
License: GNU General Public License
Website: XOOPS Official Site

XOOPS is a free content management system, written in PHP, for websites. It uses a 
modular architecture allowing users to customize, update and theme their websites. XOOPS is 
released under the terms of the GNU General Public License (GPL) and is free to use, modify 
and redistribute. XOOPS is meant to be pronounced phonetically as it would be in English, as
"zoo'ps".

The XOOPS project was awarded First Runner-Up status in the SourceForge.net's 2006 
Community Choice Awards in the Development category. 


Overview 


XOOPS is an acronym of eXtensible Object Oriented Portal System. Though started as a 
portal system, XOOPS is developing as a content management system. It aims to serve as a web 
framework for use by small, medium and large sites, through the installation of modules. For 
example, a small XOOPS installation can be used as a personal weblog or journal, but this can be 
expanded upon and customised, for example users might add the appropriate modules (freeware 
and commercial) to store content in news, forums, downloads, and more. 

Steve Atwal wrote a book called Building Websites with Xoops (ISBN 1904811280) about 
XOOPS. 


Key features


Community
Because XOOPS is released under the terms of the GNU General Public License (GPL) the 
growth and development of XOOPS is dependent on the contributions of a worldwide 
community effort. 

Database 
XOOPS uses a relational database (currently MySQL) to store data required for running 
a web-based content management system. 

Advanced access control layer 
Webmasters have the ability to grant specific groups of users specific access rights to 
content and features such as edit, delete, upload, add attachments, publish content, and 
so on. 

Modularized 
Modules can be installed, uninstalled, activated, and deactivated using the module 
administration system. 

Core features usable by modules 
XOOPS provides core features that modules can use, such as permissions, comments,
notifications and blocks.

Personalization 
Webmasters have the ability to set specific permissions for user access and control of 
individual elements of their websites. 

User Management 
The ability to search for users by various criteria, send email and private messages to 
users through a template-based messaging system. 

International language support 
The XOOPS community has official support sites around the world for non-English- 
speaking users. Additionally XOOPS itself supports multi-byte character sets for 
languages that use characters not in the Latin alphabet, for example Japanese, Simplified 
and Traditional Chinese, Korean, etc. 

Theme-based skinnable interface 
XOOPS uses themes for page presentation. Both administrators and users can change 
the look of the entire web site by selecting from available themes. 

Templates 
XOOPS uses the Smarty templating engine which allows many benefits such as easier 
separation between business logic and presentation logic as well as content caching. 

Built-in LDAP authentication 

Many SEO add-ons


See also 


• List of content management systems


Xaraya


Xaraya is a web application framework and content management system, written in PHP and
released as free software under the GNU General Public License. It began as a fork of the
original 0.8 development branch of PostNuke, but has since been rewritten from the ground up.


Features 


• Modular architecture.
• Widely re-usable code.
• Template-controlled presentation.
• Possibility to redesign and re-export the data structure of modules, using the
Dynamic Data Model.
• Support for different article types (such as News, Documents, Links, Reviews,
FAQ, etc.), and the possibility to easily add new article types from the
administration area.
• Modular authentication, allowing Xaraya to support authentication methods
such as LDAP and PAM. A working LDAP module is shipped by default with Xaraya,
and new plugin systems are being developed.
• An advanced permissions system.
• Built-in short URL support in many modules.
• Template and output caching.


Criticism 


• Because of its abstract structure, Xaraya has significant performance issues and
depends on caching for all page serving.
• Because of its flexibility, Xaraya has a very steep learning curve.
• There is little support for custom modules, although the software is actively
supported through Xaraya.com [1].


See also 


• Comparison of content management systems


Version control systems 


Revision control (also known as version control, source control or (source) code
management (SCM)) is the management of multiple revisions of the same unit of information.
It is most commonly used in engineering and software development to manage ongoing
development of digital documents like application source code, art resources such as blueprints
or electronic models and other critical information that may be worked on by a team of people.
Changes to these documents are identified by incrementing an associated number or letter code,
termed the "revision number", "revision level", or simply "revision" and associated historically
with the person making the change. A simple form of revision control, for example, has the
initial issue of a drawing assigned the revision number "1". When the first change is made, the
revision number is incremented to "2" and so on.

Software tools for revision control are increasingly recognized as being necessary for the
organization of multi-developer projects.


Overview 


Engineering revision control developed from formalized processes based on tracking 
revisions of early blueprints or bluelines. Implicit in this control was the ability to return to any 
earlier state of the design, for cases in which an engineering dead-end was reached in the 
development of the design. Likewise, in computer software engineering, revision control is any 
practice that tracks and provides control over changes to source code. Software developers 
sometimes use revision control software to maintain documentation and configuration files as 
well as source code. Also, version control is widespread in business and law. Indeed, "contract 
redline" and "legal blackline" are some of the earliest forms of revision control, and are still 
employed with more or less sophistication. An entire industry has emerged to service the 
document revision control needs of business and other users, and some of the revision control 
technology employed in these circles is subtle, powerful, and innovative. The most sophisticated 
techniques are beginning to be used for the electronic tracking of changes to CAD files (see 
Product Data Management), supplanting the "manual" electronic implementation of traditional 
revision control. 

As software is designed, developed and deployed, it is extremely common for multiple 
versions of the same software to be deployed in different sites, and for the software's developers 
to be working simultaneously on updates. Bugs and other issues with software are often only 
present in certain versions (because of the fixing of some problems and the introduction of others 
as the program develops). Therefore, for the purposes of locating and fixing bugs, it is vitally 
important to be able to retrieve and run different versions of the software to determine in which 
version(s) the problem occurs. It may also be necessary to develop two versions of the software 
concurrently (for instance, where one version has bugs fixed, but no new features, while the 
other version is where new features are worked on). 

At the simplest level, developers could simply retain multiple copies of the different versions 
of the program, and number them appropriately. This simple approach has been used on many 
large software projects. While this method can work, it is inefficient as many near-identical 
copies of the program have to be maintained. This requires a lot of self-discipline on the part of 
developers, and often leads to mistakes. Consequently, systems to automate some or all of the 
revision control process have been developed. 

Moreover, in software development and other environments, including in legal and business 
practice, it is increasingly common for a single document or snippet of code to be edited by a 
team, the members of which may be geographically diverse and/or may pursue different and 
even contrary interests. Sophisticated revision control that tracks and accounts for ownership of 
changes to documents and code may be extremely helpful or even necessary in such situations. 

Another use for revision control is to track changes to configuration files, such as those
typically stored in /etc or /usr/local/etc on Unix systems. This gives system administrators
another way to easily track changes to configuration files and a way to roll back to earlier
versions should the need arise.


Compression 


Most revision control software can use delta compression, which retains only the differences 
between successive versions of files. This allows more efficient storage of many different 
versions of files. 
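
The idea of keeping only the differences can be sketched in a few lines of Python. The example below uses the standard library's difflib to describe a new version as a list of "copy these old lines" and "insert these new lines" operations, so that unchanged lines are never stored twice. The function names make_delta and apply_delta and the delta layout are assumptions made for this illustration, not the storage format of any particular revision control tool.

```python
import difflib


def make_delta(old, new):
    """Describe `new` relative to `old` as a list of copy/insert operations."""
    delta = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged lines are referenced by position, not stored again.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only new or changed lines are stored in full.
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no entry: those old lines are simply not copied.
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus the stored delta."""
    new = []
    for op in delta:
        if op[0] == "copy":
            new.extend(old[op[1]:op[2]])
        else:
            new.extend(op[1])
    return new


if __name__ == "__main__":
    v1 = ["alpha", "beta", "gamma"]
    v2 = ["alpha", "BETA", "gamma", "delta"]
    delta = make_delta(v1, v2)
    print(delta)
    print(apply_delta(v1, delta) == v2)
```

This is a forward delta (old version plus delta gives the new one). A system can equally store the newest version in full and keep backward deltas to reach older ones.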


Source management models 


Traditional revision control systems use a centralized model, where all the revision control
functions are performed on a shared server. If two developers try to change the same file at the
same time, without some method of managing access the developers may end up overwriting
each other's work. Centralized revision control systems solve this problem with one of two
"source management models": file locking or version merging.


File locking 


The simplest method of preventing "concurrent access" problems is to lock files so that only 
one developer at a time has write access to the central "repository" copies of those files. Once 
one developer "checks out" a file, others can read that file, but no one else is allowed to change 
that file until that developer "checks in" the updated version (or cancels the checkout). 
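
The check-out/check-in cycle described above can be modelled with a small in-memory class. This is a minimal sketch under stated assumptions: the class LockingRepository, its method names and the LockError exception are hypothetical and do not correspond to the interface of any real tool.

```python
class LockError(Exception):
    """Raised when a file is locked by someone else."""


class LockingRepository:
    """Toy central repository that allows one writer per file at a time."""

    def __init__(self):
        self.files = {}   # path -> current content
        self.locks = {}   # path -> developer holding the lock

    def read(self, path):
        # Reading is always allowed, even while the file is checked out.
        return self.files.get(path, "")

    def check_out(self, path, developer):
        holder = self.locks.get(path)
        if holder is not None and holder != developer:
            raise LockError(f"{path} is locked by {holder}")
        self.locks[path] = developer
        return self.read(path)

    def check_in(self, path, developer, content):
        if self.locks.get(path) != developer:
            raise LockError(f"{developer} does not hold the lock on {path}")
        self.files[path] = content
        del self.locks[path]   # checking in releases the lock

    def cancel_check_out(self, path, developer):
        if self.locks.get(path) == developer:
            del self.locks[path]
```

A second developer calling check_out on a locked path gets a LockError until the first developer checks in or cancels.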

File locking has merits and drawbacks. It can provide some protection against difficult merge 
conflicts when a user is making radical changes to many sections of a large file (or group of 
files). But if the files are left exclusively locked for too long, other developers can be tempted to 
simply bypass the revision control software and change the files locally anyway. That can lead to 
more serious problems. 


Version merging 


Most version control systems, such as CVS, allow multiple developers to be editing the same 
file at the same time. The first developer to "check in" changes to the central repository always 
succeeds. The system provides facilities to merge changes into the central repository, so the 
improvements from the first developer are preserved when the other programmers check in. 

The concept of a reserved edit can provide an optional means to explicitly lock a file for 
exclusive write access, even though a merging capability exists. 
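
To show how two sets of edits to the same file can be combined, the following Python sketch performs a very simplified three-way merge against the common base version. It assumes, purely for brevity, that edits only modify existing lines, so all three versions have the same number of lines. Real systems also handle inserted and deleted lines. The names merge_lines and MergeConflict are made up for this example.

```python
class MergeConflict(Exception):
    """Raised when both sides changed the same line in different ways."""


def merge_lines(base, mine, theirs):
    """Three-way merge of equally long lists of lines.

    Assumption for this sketch: edits only modify existing lines, so all
    three versions have the same number of lines.
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged = []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:
            # Both sides agree (unchanged, or changed identically).
            merged.append(m)
        elif m == b:
            # Only the other developer changed this line.
            merged.append(t)
        elif t == b:
            # Only this developer changed this line.
            merged.append(m)
        else:
            raise MergeConflict(f"line {number} was changed by both developers")
    return merged
```

When the second developer checks in, the repository copy (already containing the first developer's change) plays the role of theirs, so both improvements survive unless they touch the same line.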


Distributed revision control


Distributed revision control takes a peer-to-peer approach, as opposed to the client-server
approach of centralized systems. Rather than a single, central repository on which clients
synchronize, each peer's working copy of the codebase is a bona fide repository.
Synchronization is conducted by exchanging patches (change-sets) from peer to peer. This
results in some striking differences from a centralized system:
• No canonical, reference copy of the codebase exists by default; only working
copies.
• Common operations such as commits, viewing history, and reverting changes
are fast, because there is no need to communicate with a central server.
• Each working copy is effectively a remote backup of the codebase and change
history, providing natural security against data loss.
There are two types of distributed systems: open and closed. Open systems are tuned more to
open-source development, and closed systems to traditional, single-baseline development.


Open Systems 


An open system of distributed revision control is characterized by its support for independent 
branches, and its heavy reliance on merge operations. Its general characteristics are: 
• Every working copy is effectively a branch.
• Each branch is actually implemented as a working copy, with merges conducted
by ordinary patch exchange, from branch to branch.
• It may be possible to "cherry-pick" single changes, selectively pulling them from
peer to peer.
• New peers can freely join, without applying for access to a server.
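
The characteristics listed above can be illustrated with a toy model in Python in which every peer holds its own set of change-sets and synchronizes by pulling the ones it lacks from another peer. The Peer class, its methods and the change identifiers are hypothetical, and the sketch only tracks which change-sets a peer has. It does not apply patches to files.

```python
class Peer:
    """A working copy that is also a complete repository."""

    def __init__(self, name):
        self.name = name
        self.changes = {}    # change id -> description
        self._counter = 0    # number of commits made locally

    def commit(self, description):
        # A purely local operation: no server is contacted.
        self._counter += 1
        change_id = f"{self.name}-{self._counter}"
        self.changes[change_id] = description
        return change_id

    def pull(self, other, only=None):
        """Copy missing change-sets from `other`.

        Passing a set of ids as `only` cherry-picks just those changes.
        """
        pulled = []
        for change_id, description in other.changes.items():
            if change_id in self.changes:
                continue
            if only is not None and change_id not in only:
                continue
            self.changes[change_id] = description
            pulled.append(change_id)
        return pulled
```

A new peer joins simply by creating a Peer and pulling from any existing one; no central server has to grant access.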
One of the first open systems was BitKeeper, notable for its use in the development of the
Linux kernel. A later decision by the makers of BitKeeper to restrict its licensing led the Linux
developers to search for a free replacement. Common open systems now in free use are:


• Bazaar
• Darcs
• Git
• Mercurial
• Monotone
• SVK
• for a full list, see the comparison of revision control software


Closed Systems


A closed system of distributed revision control is based on a Replicated Database. A check-in 
is equivalent to a distributed commit. Successful commits create a single baseline. An example
of a closed distributed system is Code Co-op. 


Integration


Some of the more advanced revision control tools offer many other facilities, allowing deeper
integration with other tools and software engineering processes. Plugins are often available for
IDEs such as IntelliJ IDEA, Eclipse and Visual Studio. NetBeans IDE comes with integrated
version control support.


Common vocabulary


Terminology can vary from system to system, but here are some terms in common usage.

Baseline 
An approved revision of a document or source file from which subsequent changes can 
be made. 

Branch 
A set of files under version control may be branched or forked at a point in time so 
that, from that time forward, two copies of those files may be developed at different 
speeds or in different ways independently of the other. 

Check-out 
A check-out (or checkout or co) creates a local working copy from the repository. 
Either a specific revision is specified, or the latest is obtained. 

Commit 
A commit (check-in, ci or, more rarely, install or submit) occurs when a copy of the 
changes made to the working copy is written or merged into the repository. 

Conflict 
A conflict occurs when two changes are made by different parties to the same 
document, and the system is unable to reconcile the changes. A user must resolve the 
conflict by combining the changes, or by selecting one change in favour of the other. 

Change 
A change (or diff, or delta) represents a specific modification to a document under 
version control. The granularity of the modification considered a change varies between 
version control systems. 

Change list 
On many version control systems with atomic multi-change commits, a changelist, 
change set, or patch identifies the set of changes made in a single commit. This can 
also represent a sequential view of the source code, allowing source to be examined as 
of any particular changelist ID. 

Dynamic stream 
A stream (a data structure that implements a configuration of the elements in a 
particular repository) whose configuration changes over time, with new versions 
promoted from child workspaces and/or from other dynamic streams. It also inherits 
versions from its parent stream. 

Export 
An export is similar to a check-out except that it creates a clean directory tree without 
the version control metadata used in a working copy. Often used prior to publishing the 
contents. 

Head 
The most recent commit. 

Import 
An import is the action of copying a local directory tree (that is not currently a working 
copy) into the repository for the first time. 

Mainline 
Similar to Trunk, but there can be a Mainline for each branch. 

Merge
A merge or integration brings together two sets of changes to a file or set of files into a
unified revision of that file or files.

• This may happen when one user, working on those files, updates their working
copy with changes made, and checked into the repository, by other users.
Conversely, this same process may happen in the repository when a user tries
to check in their changes.
• It may happen after a set of files has been branched, then a problem that
existed before the branching is fixed in one branch and this fix needs merging
into the other.
• It may happen after files have been branched, developed independently for a
while and then are required to be merged back into a single unified trunk.

Repository 
The repository is where the current and historical file data is stored, often on a server. 
Sometimes also called a depot (e.g. with SVK, AccuRev and Perforce). 

Reverse integration 
The process of merging different team branches into the main trunk of the versioning 
system. 

Revision 
A revision or version is one version in a chain of changes. 

Tag 
A tag or release refers to an important snapshot in time, consistent across many files. 
These files at that point may all be tagged with a user-friendly, meaningful name or 
revision number. 

Trunk 
The unique line of development that is not a branch (sometimes also called Baseline or 
Mainline) 

Resolve 
The act of user intervention to address a conflict between different changes to the same 
document. 

Update 
An update (or sync) merges changes that have been made in the repository (e.g. by 
other people) into the local working copy. 

Working copy 
The working copy is the local copy of files from a repository, at a specific time or 
revision. All work done to the files in a repository is initially done on a working copy, 
hence the name. Conceptually, it is a sandbox. 


See also 


• Comparison of revision control software
• List of revision control software - by category


List of revision control software


This is a list of notable software for revision control.


Distributed model 


In the distributed approach, each developer works directly with their own local repository, 
and changes are shared between repositories as a separate step. 


Open source 


• GNU arch - the original
• ArX - started as a fork of GNU arch, but has been completely rewritten.
• Monotone - fully decentralized in a P2P way.
• Git - designed by Linus Torvalds to address the needs of the Linux kernel project
• Bazaar - written in Python. Decentralised, and aims to be fast and easy to use.
Can losslessly import Arch archives.
• Mercurial - written in Python. Decentralised and aims to be fast, lightweight,
portable, and easy to use.
• Codeville - written in Python, uses an innovative merging algorithm.
• Darcs - written in the Haskell language, can keep track of inter-patch
dependencies and automatically rearrange and "cherry-pick" them using a
"theory of patches".
• SVK - written in Perl, built on top of Subversion to allow distributed commits.
• Aegis - mature software (but filesystem-oriented, with poor network support)
• DCVS - a decentralized CVS-based version control system for distributed
software development.
• LibreSource - a virtual office allowing geographically spread teams to
work simultaneously on a common project. LibreSource offers advanced
functionality for configuration management with its generic synchronisation
module So6, resulting from INRIA research works. The creation of development
chains, validation processes and quality control processes can be done through
the interconnection of synchronisers.
• ARCS - Automated Revision Control System. A front-end to Git, CVS,
Subversion and Darcs that allows real-time file-system monitoring, with the aim
of simplifying the revision control process.
• tcldbrcs - database-driven RCS on PostgreSQL, with no specific server outside of
PostgreSQL.


Proprietary


• BitKeeper (was used in Linux kernel development 2002 - April 2005)
• Code Co-op - A peer-to-peer version control system (can use e-mail for
synchronization)
• TeamWare - Designed by Larry McVoy, creator of BitKeeper
• Wandisco - multi-site replication for CVS and Subversion


Client-server model 


In the client-server model, developers use a shared single repository. 


Local only 


Revision Control System (RCS) — A 1980s source code control system that 
uses separate backward deltas, giving faster access to the trunk tip than 
SCCS, and an improved user interface. The backward deltas come at the cost 
of slow branch tip access and no support for included/excluded deltas 
(.scc compliant). 

Source Code Control System (SCCS) — A 1970s source code control system that 
is part of UNIX. The original UNIX .scc program is based on interleaved 
deltas (and, contrary to a common misconception, not on separate forward 
deltas/diffs); by including and excluding revisions, it can construct 
versions as arbitrary sets of revisions (that is, of the deltas associated 
with them). 


Open source 


Concurrent Versions System (CVS) — Originally built on the Revision Control 
System. 
CVSNT - A cross-platform port of CVS that allows case insensitive file names 
among other changes. 
OpenCVS - Compatible with CVS, with emphasis put on security and source 
code correctness. 
Subversion — An open source version control tool. 
Vesta — A build system with a versioning file system and support for 
distributed repositories. Used at Intel for microprocessor design. 


Proprietary 


AccuRev — A fast and easy-to-use SCM tool with integrated issue tracking based 
on "Streams" that efficiently manages parallel and global development. A 
replication server is also available. 

Alienbrain — A SCM tool by Avid Technology 

AllFusion Harvest Change Manager — Commercial change and configuration 
management tool from Computer Associates 

IBM Rational ClearCase — Market leading configuration management system 
from IBM Rational Software (.scc compliant) 

CMVC — Configuration Management Version Control. An IBM version control 
system, no longer available. 

CollabNet Subversion — Subversion binaries compiled, certified and supported 
by CollabNet, initiator of the Subversion open source project. 

Evolution — version management from ionForge, with remote access, branching 
models, customizable workflow, and integrations into development, graphic, 
and modeling tools. 

FileHamster - Version control freeware from MOGware. 

MKS Inc. — MKS Integrity is a process centric enterprise Application Lifecycle 
Management platform. 

MOG - Client-server workflow and revision software for video game 
developers. Developed by MOGware. 

Perforce — Built around a client-server architecture. Free for use in open 
source projects. 

Plastic SCM 

Project Overlord Asset/Project Management Software — client/server 
application designed specifically for CG animation/vfx studios. Gives the ability 
to track assets as well as different project related data such as shots and tasks. 

PVCS — Polytron Version Control System 

QVCS — Quma Version Control System 

Serena Dimensions, the successor to PVCS 

SourceAnywhere Hosted — A hosted source control solution. Published by 
Dynamsoft Corporation. 

SourceAnywhere Standalone — A SQL-based source control solution designed 
to be an alternative of Microsoft Visual SourceSafe. It is for developers who are 
ready to migrate away from VSS for performance, security, scalability, and 
reliability. SourceAnywhere Standalone is one product of SourceAnywhere 
product family published by Dynamsoft Corporation. 

SourceHaven — Originally based on Subversion, adds an embedded Oracle 
database backend and a web application for administration 
StarTeam, Borland — A robust platform for coordinating and managing the 
entire software delivery process, StarTeam promotes team communication and 
collaboration through centralized control of all of a project's digital assets and 
activities. 

Store, Cincom - A source code management and versioning system for Cincom's 
Visualworks Smalltalk environment. 

Surround SCM — Cross platform SCM tool for teams. Features a workflow 
capability to track what state changes have been made. 

Team Coherence — Integrated Version Control and Bug Tracking. 

Telelogic SYNERGY, a Task-Based CM system (.scc compliant) 

Vault (revision control system) — A version control tool by SourceGear 
designed to supplant Microsoft's VSS revision control software. 

Version Manager, ebiexperts — A data-based version control tool that can 
compare MS Office, XML and PDF files as well as Business Objects and Crystal 
Reports reports and universes. 

Visual SourceSafe, Microsoft — A version control tool provided by Microsoft 
oriented towards small teams. 

Visual Studio Team System, Microsoft — a process-based client-server suite of 
tools for larger development organizations, incorporating work item tracking, 
reporting, build automation, unit and web testing, and integration with 
Microsoft Office 


See also 


Comparison of revision control software 

Revision control 


Comparison of revision control software 


The following tables compare general and technical information for notable revision control 
and software configuration management (SCM) software. This article is not all-inclusive and 
may become out of date quickly. 


General information 


Software | Maintainer | Development status | Repository model | Concurrency model | License | Platforms supported | Cost 
AccuRev | AccuRev, Inc. | actively developed | Client-server | Merge or lock | Proprietary | Unix-like, Windows, Mac OS X | Unspecified; free 2-user 30-day trial licenses available 
Aldon | Aldon | actively developed | Client-server | Merge or lock | Proprietary | Linux, Windows, OS/400 | Unspecified 
Bazaar | Canonical Ltd. | actively developed | Distributed | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
ClearCase | IBM Rational | actively developed | Client-server and Distributed | Merge or lock [2] | Proprietary | Unix-like, Windows, i5/OS, z/OS | $4250 per concurrent user plus tax (includes 12 months support) 
Code Co-op | Reliable Software | actively developed | Distributed | Merge | Proprietary | Windows | $150 per seat 
Codeville | Ross Cohen | actively developed | Distributed | Merge | BSD | Unix-like, Windows, Mac OS X | Free 
CVS | The CVS Team | maintained but new features not added | Client-server | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
CVSNT | March Hare Software and community members | maintained and new features under development | Client-server | Merge | GPL or proprietary | Unix-like, Windows, Mac OS X, OS/400 | Free or commercial 
darcs | David Roundy | actively developed | Distributed | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
Git | Junio C Hamano | actively developed | Distributed | Merge | GPL | POSIX, Windows, Mac OS X | Free 
GNU arch | Andy Tai | maintained but new features not added | Distributed | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
LibreSource Synchronizer | Artenum | maintained and new features under development | Client-server | Merge | QPL | Unix-like, Windows, Mac OS X | Free 
Mercurial | Matt Mackall | actively developed | Distributed | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
Monotone | Nathaniel Smith, Graydon Hoare | actively developed | Distributed | Merge | GPL | Unix-like, Windows, Mac OS X | Free 
Perforce | Perforce Software Inc. | actively supported | Client-server | Merge or lock | Proprietary | Unix-like, Windows, Mac OS X | Free for up to 2 users, and for OSS development; else $800 per seat, with volume discounts 
PlasticSCM | Codice Software | actively developed | Client-server | ? | Proprietary | Unix-like, Windows, Mac OS X | $621 per seat, with volume discounts 
SourceHaven | Veterisoft, Inc. | actively supported | Client-server | ? | Proprietary | Unix-like, Windows, Mac OS X | $295 per seat with volume discounts 
StarTeam | Borland | actively developed | Client-server | Merge or lock | Proprietary | Windows, and cross-platform via Java-based client | Included with BDS2006 Enterprise and up, otherwise contact Borland 
Subversion (SVN) | CollabNet, Inc. | actively developed | Client-server | Merge or lock | Apache/BSD style | Unix-like, Windows, Mac OS X | Free (commercial support/services available) 
Surround SCM | Seapine Software | actively developed | Client-server | Merge or lock | Proprietary | Unix-like, Windows, Mac OS X | Commercial 
SVK | Best Practical | actively developed | Distributed | Merge | Artistic/GPL | Unix-like, Windows, Mac OS X | Free 
Telelogic SYNERGY | Telelogic AB | actively supported | Client-server | Merge or lock | Proprietary | Linux, Windows | ? 
Vault | SourceGear LLC | actively supported | Client-server | Lock or merge | Proprietary | Unix-like, Linux, Windows | $249 per user 
Visual SourceSafe | Microsoft | serious bug fixes only | Client-server | Lock or merge | Proprietary | Windows | Bundled with Visual Studio 
Team Foundation Server | Microsoft | actively developed | Client-server | Lock or merge | Proprietary | Server: Windows Server 2003; Clients: Windows and Web included; Java client (for Eclipse IDE, MacOS, Unix) available | Licensed through MSDN subscription or through direct buy 

1. Although Bazaar is a distributed version control system, it can also be used in 
a centralized manner using lock step development and checkouts. 

2. In ClearCase, a trigger may be set to allow for the lock model, and this is done 
at many sites. However, ClearCase development usually takes place on private 
branches where each developer is given their own branch, so the lock vs. merge 
concurrency model doesn't matter as much. Code is merged back to the main 
branch once the developer is ready to deliver their code to the project. 

3. SVK allows Subversion to have distributed branches. 

4. In Subversion, a file attribute enables the lock model on a per-file basis. This 
file attribute can be set automatically using file name wildcard expressions. 

Repository model describes the relationship between various copies of the 
source code repository. In a client-server model, users access a master 
repository via a client; typically, their local machines hold only a working copy 
of a project tree. Changes in one working copy must be committed to the master 
repository before they are propagated to other users. In a distributed model, 
repositories act as peers, and users typically have a local repository with 
version history available, in addition to their working copies. 

Concurrency model describes how changes to the working copy are managed to 
prevent simultaneous edits from causing nonsensical data in the repository. In a 
lock model, changes are disallowed until the user requests and receives an 
exclusive lock on the file from the master repository. In a merge model, users 
may freely edit files, but are informed of possible conflicts upon checking their 
changes into the repository, whereupon the version control system may merge 
changes on both sides, or let the user decide when conflicts arise. Note that 
distributed version control almost always implies a merge concurrency model. 
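
The two concurrency models can be contrasted with a small sketch. The classes below are hypothetical and greatly simplified: the merge is done line by line and assumes that all versions of the file have the same number of lines, which real merge tools do not require.

```python
class LockModelFile:
    """Lock model: only the holder of the exclusive lock may check in."""

    def __init__(self, text):
        self.text = text
        self.locked_by = None

    def lock(self, user):
        if self.locked_by not in (None, user):
            return False              # someone else holds the lock
        self.locked_by = user
        return True

    def check_in(self, user, new_text):
        if self.locked_by != user:
            raise PermissionError(f"{user} does not hold the lock")
        self.text = new_text
        self.locked_by = None         # checking in releases the lock


class MergeModelFile:
    """Merge model: anyone may edit; conflicts surface at check-in."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.revision = 0

    def check_out(self):
        return self.revision, list(self.lines)

    def check_in(self, base_revision, base_lines, new_lines):
        if base_revision == self.revision:
            self.lines = list(new_lines)    # nobody else changed the file
        else:
            merged = []
            for base, theirs, mine in zip(base_lines, self.lines, new_lines):
                if mine == base:
                    merged.append(theirs)   # only the other side changed
                elif theirs == base or theirs == mine:
                    merged.append(mine)     # only this side changed
                else:
                    return "conflict"       # both changed: the user decides
            self.lines = merged
        self.revision += 1
        return "ok"


locked = LockModelFile("v1")
alice_has_lock = locked.lock("alice")
bob_has_lock = locked.lock("bob")

shared = MergeModelFile(["title", "body", "footer"])
rev_a, base_a = shared.check_out()
rev_b, base_b = shared.check_out()
first = shared.check_in(rev_a, base_a, ["TITLE", "body", "footer"])
second = shared.check_in(rev_b, base_b, ["title", "body", "FOOTER"])
third = shared.check_in(rev_b, base_b, ["Title!", "body", "footer"])
```

In this sketch the second check-in is merged automatically because the two users edited different lines, while the third is reported as a conflict because both sides changed the first line.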


Technical information 


Software | Programming language | History model | Revision IDs | Repo. size | Network protocols 
AccuRev | Java | Changeset | Namespace | O(revisions) | custom 
Aldon | C++, Java | Snapshot | Namespace | O(revisions) | custom 
Bazaar | Python | Snapshot | Pseudorandom | O(patch) | HTTP, SFTP, FTP, ssh, custom, email bundles 
ClearCase | ? | Snapshot | Namespace | O(patch) | HTTP, custom (CCFS), custom (MVFS filesystem driver) 
Code Co-op | C++ | Changeset | User ID-Ordinal | O(patch) | e-mail (MAPI, SMTP/POP3, Gmail), LAN 
CVS | C | Snapshot | Namespace | O(patch) | pserver, ssh 
CVSNT | C++ | Changeset | Namespace | O(patch) | sspi, sserver, gserver, pserver, ssh 
darcs | Haskell | Patch | Namespace | O(patch) | HTTP, ssh, email 
Git | C, shell scripts | Snapshot | SHA-1 hashes | O(patch) | custom, ssh, rsync, HTTP, FTP, email, bundles 
GNU arch | C, shell scripts | Changeset | Namespace | O(patch) | WebDAV, HTTP 
Mercurial | Python, C | Changeset | SHA-1 hashes | O(patch) | HTTP, ssh, email (with plugin) 
Monotone | C++ | Hybrid [7] | SHA-1 hashes | O(patch) | custom (netsync) 
Perforce | ? | Changeset | Namespace | O(patch) | custom 
SourceHaven | C, Java | Snapshot | Namespace | O(patch) | WebDAV, custom 
StarTeam | ? | Snapshot | MD5 hashes | O(revisions) | custom 
Subversion | C | Snapshot | Namespace | O(patch) | WebDAV, custom (svnserve) [8] 
SVK | Perl | Changeset | ? | ? | ? 
Vault | C# | Snapshot | ? | O(patch) | HTTP, HTTPS 
Visual SourceSafe | C | Snapshot | Namespace? | O(changes)? | None, but can access repository files through a "share" 
Team Foundation Server | C++ and C# | Changeset | Namespace | O(patch) | SOAP over HTTP or HTTPS 
LibreSource Synchronizer | Java | Changeset | Timestamp of the repository | O(patch) | HTTP, File-System 
PlasticSCM | C# | Changeset | Namespace | O(revisions) | custom 

5. A Bazaar bundle is a summary diff, with sufficient extra information to 
preserve history. 

6. Mercurial 0.5b vs git: overview of performance/scalability. 

7. Monotone's revisions represent changesets and its manifests represent 
snapshots; each revision is linked to some manifest. But manifests are legacy 
constructs: they are no longer kept in the database and are reconstructed on 
the fly if needed. The real work now happens in rosters, which are hybrid 
snapshot/changeset structures. 

8. ssh tunneling is optional 


History model describes the form in which changes are stored in the repository. 
For example, when a change is committed, a system could store a copy of the 
tree before and after the change (snapshot), or it might instead store a copy of 
the tree before the change and a changeset representing the changes. 

Revision IDs are used internally to identify specific versions of files in the 
repository. Systems may use pseudorandom identifiers, content hashes of 
revisions, or filenames with sequential version numbers (namespace). 

Repo. size describes how the repository grows as changes are committed. 
O(patch) means that it grows as the size of the patches between revisions, while 
O(revisions) means that it grows as the size of each revision checked in. 

Network protocols lists the protocols used for synchronization of changes. 
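
The difference between the history models, and its effect on repository growth, can be sketched with Python's standard difflib and hashlib modules. The file contents and helper names below are made up for illustration, and the sizes are simple character counts of ASCII text. Real tools compress and pack their data, so this only shows the growth behaviour, not how any listed system stores revisions.

```python
import difflib
import hashlib

# Six revisions of a 1000-line file; each revision edits a single line.
revisions = [[f"line {i}\n" for i in range(1000)]]
for n in range(1, 6):
    edited = list(revisions[-1])
    edited[n] = f"line {n} (edited)\n"
    revisions.append(edited)


def revision_id(lines):
    # Content-hash ID: identical content always yields the same identifier.
    return hashlib.sha1("".join(lines).encode("utf-8")).hexdigest()


def changeset(old, new):
    # A changeset keeps only the lines that differ between two revisions.
    return "".join(difflib.unified_diff(old, new, n=0))


ids = [revision_id(lines) for lines in revisions]

# Snapshot history: every revision is stored in full, O(revisions).
snapshot_bytes = sum(len("".join(lines)) for lines in revisions)

# Changeset history: the first revision in full, then one diff per change,
# O(patch).
changeset_bytes = len("".join(revisions[0])) + sum(
    len(changeset(old, new)) for old, new in zip(revisions, revisions[1:])
)
```

Comparing snapshot_bytes with changeset_bytes shows the snapshot store growing by a full copy per revision, while the changeset store grows only by the size of each small diff.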


Features 


Software | Atomic commits | File renames | Merge file renames | Symbolic links | Pre/post event hooks | Signed revisions | Merge tracking | End of line conversions | Tags | International support 
AccuRev | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? 
Aldon | Yes | Yes | ? | Yes | Yes | Yes | Yes | ? | ? | ? 
Bazaar | Yes | Yes | Yes | Yes | Yes | Partial [9] | Yes | No | Yes [10] | ? 
ClearCase | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? 
Code Co-op | Yes | Yes | Yes | No | limited support | No | No | No | Yes | ? 
CVS | No | No | No | No | limited support | No | No | Yes | Yes | ? 
CVSNT | Yes | Yes | ? | No | Yes | No | Yes | Yes | Yes | 
darcs | Yes | Yes | Yes | No | Yes | Yes | N/A [11] | ? | Yes | 
Git | Yes | Yes (implicit) | Yes | Yes | Yes | Yes | Yes | Yes | Yes | 
GNU arch | Yes | Yes | ? | Yes | Yes | Yes | ? | ? | Yes | 
Mercurial | Yes | Yes | Yes | Yes [12] | Yes | Yes | Yes | Yes | Yes | 
Monotone | Yes | Yes | Yes | No [13] | Yes | Yes, mandatory | Yes | No | Yes | 
Perforce | Yes | Yes [14] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No [15] 
SourceHaven | Yes | Yes | ? | Yes | Yes | No | ? | ? | ? | ? 
StarTeam | Yes [16] | Yes | ? | Yes | No | No | Yes | ? | Yes | ? 
Subversion | Yes | Yes [17] | No | Yes | Yes | No | No [18] | Yes | Partial [19] | Yes 
SVK | Yes | Yes | No | Yes | Yes [20] | ? | Yes | Yes | Yes | 
Vault | Yes | Yes | Yes | No | Yes | No | No | Yes | Yes | 
Visual SourceSafe | No | Yes | ? | Yes (using shares) | Yes | No | No | ? | Yes | 
Team Foundation Server | Yes | Yes | Yes | ? | Yes | ? | Yes | ? | Yes | ? 
LibreSource Synchronizer | Yes | Yes | Yes | No | limited support [21] | No | Yes [22] | No | Yes | ? 
Plastic SCM | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? | Yes | ? 

9. They can be automatically generated and manually verified, but verification is 
not automatic. 

10. Tags were introduced into Bazaar in version 0.15. Previous versions needed 
to use a plug-in. 

11. Each darcs patch bears a unique identifier, so it is practically impossible to 
merge the same patch twice into a repository. 

12. Mercurial versions 0.9.4 and higher support symlinks. 

13. It could be done via user level hooks. 

14. http://www.perforce.com/perforce/technotes/note007.html 

15. p4 supports Unicode files. Files marked as Unicode are converted to and from 
the user's local codepage unless that codepage cannot represent every 
character in the file. Therefore a Unicode file with Japanese in it cannot be 
checked out by a user not running a Japanese OS. Set the file type to text and p4 
will not accept non-ASCII characters. Set the file to binary and Perforce will not 
be able to do auto-merge, a required feature for source code. Therefore, p4 
cannot be used for international software development. 

16. StarTeam supports atomic commits as of version 2006. 

17. Subversion can move a file and conserve its history, if and only if the target of 
the move is in the same Subversion repository as the source. Cross-repository 
moves require third-party tools such as svk. Also, a rename operation is actually 
a copy-with-history-and-delete sequence. 

18. A separate tool, "svnmerge", provides merge tracking. Merge tracking will 
be included in Subversion itself in the next major release. 

19. In Subversion, tags are a special case of the more generic "cheap copy" 
concept of Subversion. Per convention, a tag is a copy into a directory named 
"tags". Because of this, even tags are versioned. The reason for partial support 
in the table is that Subversion's emulation of tags in this manner does not meet 
the requirement that the tag name can be used in place of any revision identifier 
wherever the user may be required to enter one. This column would be 
meaningless if the definition were loosened enough to encompass Subversion's 
approach, as every version control system supports branching and would 
therefore support tags as well. 

20. Uses Subversion server. 

21. It is possible to embed the action in a shell or Ant script. 

22. A merge is tracked by its workspace origin. 

Atomic commits refers to a guarantee that all changes in a commit are applied 
fully, or not at all. 
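
A minimal sketch of the idea, assuming a hypothetical in-memory AtomicRepo class: changes are staged on a private copy and published in one step, so a failure part-way through leaves the repository untouched.

```python
import copy


class AtomicRepo:
    def __init__(self):
        self.files = {}

    def commit(self, changes):
        """Apply every change, or none if any single change is invalid."""
        staged = copy.deepcopy(self.files)      # work on a private copy
        for path, content in changes.items():
            if not isinstance(content, str):
                raise ValueError(f"cannot store {path!r}")
            staged[path] = content
        self.files = staged                     # publish all changes at once


repo = AtomicRepo()
repo.commit({"a.html": "<p>a</p>", "b.css": "p {}"})

try:
    # The second entry is invalid, so the first must not be applied either.
    repo.commit({"a.html": "<p>changed</p>", "c.js": None})
    failed = False
except ValueError:
    failed = True
```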

File renames describes whether a system allows files to be renamed while 
retaining their version history. 

Merge file renames describes whether a system can merge changes made to a 
file on one branch into the same file that has been renamed on another branch 
(or vice versa). If the same file has been renamed on both branches then there is 
a rename conflict that the user must resolve. 

Symbolic links describes whether a system allows revision control of symbolic 
links as with regular files. Versioning symbolic links is considered a feature 
by some people and a security risk by others (e.g. a symbolic link to 
/etc/passwd). Symbolic links are only supported on select platforms, depending 
on the software. 

Pre/post event hooks indicates the capability to trigger commands before or 
after an action, such as a commit, takes place. 
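
As an illustration only, hooks can be modelled as lists of callables run around the commit. The HookedRepo class and its attribute names are hypothetical; actual systems usually run external scripts or triggers instead.

```python
class HookedRepo:
    def __init__(self):
        self.log = []
        self.pre_commit = []    # callables that may veto by returning False
        self.post_commit = []   # callables run after a successful commit

    def commit(self, message):
        if not all(hook(message) for hook in self.pre_commit):
            return False                     # a pre-commit hook rejected it
        self.log.append(message)
        for hook in self.post_commit:
            hook(message)
        return True


notifications = []
repo = HookedRepo()
repo.pre_commit.append(lambda msg: bool(msg.strip()))   # reject empty messages
repo.post_commit.append(lambda msg: notifications.append(f"committed: {msg}"))

accepted = repo.commit("update navigation")
rejected = repo.commit("   ")
```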

Signed revisions refers to integrated digital signing of revisions, in a format such 
as OpenPGP. 

Merge tracking describes whether a system remembers what changes have 
been merged between which branches and only merges the changes that are 
missing when merging one branch into another. 
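
The bookkeeping behind merge tracking can be sketched as follows, assuming a hypothetical Branch class that identifies changes by simple string IDs and ignores conflicts entirely.

```python
class Branch:
    def __init__(self, name):
        self.name = name
        self.changes = []        # change IDs present on this branch
        self.merged_from = {}    # source branch name -> merged change IDs

    def commit(self, change_id):
        self.changes.append(change_id)

    def merge(self, source):
        """Apply only the changes from `source` that were not merged before."""
        already = self.merged_from.setdefault(source.name, set())
        missing = [c for c in source.changes if c not in already]
        self.changes.extend(missing)
        already.update(missing)
        return missing


trunk = Branch("trunk")
feature = Branch("feature")
feature.commit("f1")
feature.commit("f2")
first_merge = trunk.merge(feature)    # brings over f1 and f2

feature.commit("f3")
second_merge = trunk.merge(feature)   # only f3 is still missing
```

Without the merged_from record, the second merge would try to apply f1 and f2 again.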

End of line conversions describes whether a system can adapt the end of line 
characters for text files such that they match the end of line style for the 
operating system under which it is used. The granularity of control varies. 
Subversion, for example, can be configured to handle EOLs differently according 
to the file type, whereas Perforce converts all text files according to a single, 
per-client setting. 
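
A minimal sketch of such a conversion, assuming (as a simplification, not a description of any particular tool) that the repository stores text with LF line endings and converts to the client's native style on checkout. Both function names are hypothetical.

```python
import os


def to_repository_eol(text):
    """Store text with LF endings regardless of the client platform."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def to_native_eol(text, eol=os.linesep):
    """Convert any mix of CRLF, CR, or LF line endings to `eol`."""
    return to_repository_eol(text).replace("\n", eol)


stored = to_repository_eol("first\r\nsecond\rthird\n")
windows_copy = to_native_eol(stored, eol="\r\n")
```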

Tags indicates if meaningful names can be given to specific revisions. 

User interfaces 


Software | Web interfaces | GUIs 
AccuRev | Yes | Windows, Linux, Unix, Mac OS X, BeOS available 
Aldon | Yes | Windows 
Bazaar | can use a plain webserver, webserve, loggerhead or Trac | Olive, bzr-gtk (GTK+), QBzr (Qt) 
ClearCase | included, Clearcase Web Interface | Windows; interface for Unix-like systems is mixed command-line and GUI 
Code Co-op | Not necessary since entire project is replicated locally | Windows 
CVS | cvsweb, ViewVC, others | Windows (incl. explorer integration), Mac OS X, GTK, Qt available 
CVSNT | cvsweb, ViewVC, others | Windows, Mac OS X, OS/400, GTK, Qt available 
darcs | darcs.cgi included; darcsweb | under development; Windows, Mac OS X (alpha) available 
Git | gitweb, wit, cgit | gitk, git-gui (Tcl/Tk), tig, qgit, GTK, (h)gct (Qt) available 
GNU arch | ArchZoom | ArchWay (GTK2), TlaLog 
Mercurial | included [23] | Hgk (Tcl/Tk), (h)gct (Qt), Mercurial Eclipse 
Monotone | ViewMTN | GTK+, Qt available 
Perforce | included, P4Web, P4FTP | Windows, Linux, Unix, Mac OS X, BeOS available 
SourceHaven | included | Windows, Linux, Unix, Mac OS X, BeOS available 
StarTeam | included | Windows, Java, BDS2006 integration 
Subversion | Apache 2 module included, WebSVN, ViewSVN, ViewVC, Trac | Qt, Windows (incl. explorer integration), Java, Mac OS X, Eclipse, Netbeans, IDEA, Nautilus available 
SVK | ? | ? 
Vault | included | Windows, Unix-like, Mac OS X 
Visual SourceSafe | none included; SSWI, VSS Remoting | Windows included; Linux, Mac OS and Solaris using SourceOffSite; any Java VM using SourceAnyWhere 
Team Foundation Server | included (Team System Web Access) | Windows included; Java client for Eclipse IDE, MacOS, Unix available 
LibreSource Synchronizer | LibreSource | Windows, Linux, Unix, Mac OS X available [24] 
Plastic SCM | Not necessary since entire project is replicated locally | Windows, Linux, Unix, Mac OS X available 

23. hgweb for single-repository access and hgwebdir for multiple-repository 
access from a single HTTP address 

24. Any OS that supports a Java Virtual Machine 1.5 


History and adoption 
Software | History | Notable users 
AccuRev | First publicly released on May 18th, 1999 | Clients include: SanDisk, Sony, Orbitz, MCI, and Polycom 
Aldon | Founded in 1979 to provide software configuration and change management solutions to IT | Clients include: Nintendo, AIG, Kraft Foods, Banner Health, EON Bank, and Masterbrand Cabinets 
Bazaar | Loosely related to baz | Ubuntu, Launchpad, Drupal (but for main repository CVS - still used), KatchTV 
ClearCase | Developed beginning in 1990 by Atria Software, following concepts developed by Apollo Computer in DSEE during the 1980s | IBM, Motorola, Siemens, Ericsson, Nokia and other large organizations worldwide 
Code Co-op | The first distributed VCS, demoed in 1997, released soon after. | Clients include: Logitech, HP, Ericsson 
CVS | First publicly released July 3, 1986; based on RCS | thousands of organizations worldwide 
CVSNT | First publicly released 1998; based on CVS. Started by CVS developers with the goal of adding support for a wider range of development methods and processes. | too many to list. 
darcs | First announced on April 9, 2003 | DokuWiki, Mnet, Psi, Projects Using Darcs 
Git | Started by Linus Torvalds in April 2005, following the BitKeeper controversy | Linux kernel, Cairo, ELinks, Wine, X.Org, XCB, XMMS2 
GNU arch | Started by Tom Lord, it later became part of the GNU project. Lord resigned as maintainer in August 2005. | available for GNU Savannah and Gna.org projects 
Mercurial | Started April 6, 2005 by Matt Mackall, following the BitKeeper controversy. First released on April 19, 2005 | e2fsprogs, Mozilla, Xine, Xen, OpenJDK, OpenSolaris, wmii, MoinMoin 
Monotone | First released in April 2003 | 
Perforce | Developed by Perforce Software, Inc, which was founded in 1995 by Christopher Seiwald. | many organizations worldwide, Perl 5, FreeBSD, Trolltech 
StarTeam | Developed by StarBase software, acquired by Borland. | Borland, BT Group, Cintax corporation 
Subversion | Started by CVS developers with goal of replacing CVS | ASF, SourceForge, KDE, GNOME, GCC, Ruby, Ruby on Rails, Python, Samba, Mono, PuTTY, Zope, Xiph, GnuPG, CUPS, Wireshark, TWiki, and many organizations worldwide 
SVK | ? | ? 
Vault | ? | ? 
Visual SourceSafe | ? | ? 
Team Foundation Server | First publicly released on March, 2006 | Available on CodePlex, Microsoft itself and other large organizations worldwide 
LibreSource Synchronizer | First publicly released on June 13th, 2005 | Most of the LibreSource Community 
PlasticSCM | Developed by Codice Software, Inc, which was founded in 2005 by Pablo Santos and David Suarez | many organizations 

Notable users is a list of well known projects using the software as their primary 
revision control system, excluding the software itself, followed by a link to a full 
list if available. 

History briefly describes the software's origins and development. 

* Although Bazaar is a distributed version control system it can also be used in 
a centralized manner using lock step development and checkouts. 

2. “In ClearCase, a trigger may be set to allow for the lock model, and this is done 
at many sites. However, ClearCase development usually takes place on private 
branches where each developer is given their own branch, so the lock vs. merge 
concurrency model doesn't matter as much. Code is merged back to the main 
branch once the developer is ready to deliver their code to the project. 

3. “SVK allows Subversion to have distributed branches. 

4. “In Subversion, a file attribute enables the lock model on per-file basis. This file 
attribute can be set automatically using file name wildcard expressions. 

5. ®“ A Bazaar bundle is a summary diff, with sufficient extra information to 
preserve history. 
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“ Mercurial 0.5b vs git. overview of performance/scalability. 

* Monotone's revisions represent changesets and its manifests represent
snapshots; each revision is linked to some manifest. But manifests are legacy
constructs: they are no longer kept in the database and are reconstructed on the
fly if needed. The real work now happens in rosters, which are hybrid
snapshot/changeset structures.

* ssh tunneling is optional

* They can be automatically generated and manually verified, but verification is
not automatic

* Tags were introduced into Bazaar in version 0.15. Previous versions needed
to use a plug-in.

* darcs' patches each bear a unique identifier, so it is practically impossible to
merge the same patch twice in a repository

* Mercurial versions 0.9.4 and higher support symlinks.

* It could be done via user-level hooks

* p4 supports unicode files. Files marked as unicode are converted to and from
the user's local codepage unless that codepage cannot represent every
character in the file. Therefore a unicode file with Japanese in it cannot be
checked out by a user not running a Japanese OS. Set the file type to text and p4
will not accept non-ASCII characters. Set the file to binary and Perforce will not
be able to do auto-merge, a required feature for source code. Therefore, p4
cannot be used for international software development.

* StarTeam supports atomic commits as of version 2006 

* Subversion can move a file and conserve its history, if and only if the target of 
the move is in the same Subversion repository as the source. Cross-repository 
moves require third-party tools such as svk. Also, a rename operation is actually 
a copy-with-history-and-delete sequence. 

* A separate tool "svnmerge" provides merge tracking, see. Merge tracking will 
be included in Subversion itself in the next major release, see 

* In Subversion, tags are a special case of the more generic "cheap copy" 
concept of Subversion. Per convention, a tag is a copy into a directory named 
"tags". Because of this, even tags are versioned. See for more information. The 
reason for partial support in the table is because Subversion's emulation of tags 
in this manner does not meet the requirement that the tag name can be used in 
place of any revision identifier wherever the user may be required to enter one. 
This column would be meaningless if the definition were to be loosened enough 
to encompass Subversion's approach as every version control system supports 
branching and would therefore support tags as well. 

* Uses subversion server 

* It's possible to embed the action in a shell or Ant script.

* A merge is tracked by its workspace origin.

* hgweb for single-repository access and hgwebdir for multiple-repository
access from a single HTTP address

* Any OS that supports a Java Virtual Machine 1.5

25. Towards A Better SCM: Revlogs and Mercurial, presented by Matt Mackall to
the Ottawa Linux Symposium, July 2006


See also 


• List of revision control software
Wiki 
A wiki is a collaborative website which can be directly edited by anyone with access to it.
Ward Cunningham, developer of the first wiki, WikiWikiWeb, originally described it as "the
simplest online database that could possibly work".[1] Wikipedia is one of the best-known
wikis.


History 


WikiWikiWeb was the first site to be called a wiki.[2] Ward Cunningham started developing
WikiWikiWeb in 1994, and installed it on the Internet domain c2.com on March 25, 1995. It was
named by Cunningham, who remembered a Honolulu International Airport counter employee
telling him to take the so-called "Wiki Wiki" Chance RT-52 shuttle bus line that runs between
the airport's terminals. According to Cunningham, "I chose wiki-wiki as an alliterative substitute
for 'quick' and thereby avoided naming this stuff quick-web." "Wiki Wiki" is a reduplication of
"wiki", a Hawaiian-language word for fast. The word "wiki" (/wiːkiː wiːkiː/) is a shorter form
of "wiki wiki".

Cunningham was in part inspired by Apple's HyperCard. Apple had designed a system 
allowing users to create virtual “card stacks” supporting links among the various cards. 
Cunningham developed Vannevar Bush's ideas by allowing users to "comment on and change 
one another's text".[2][5] In the early 2000s, wikis were increasingly adopted in enterprise as 
collaborative software. Common uses included project communication, intranets, and 
documentation, initially for technical users. Today some companies use wikis as their only 
collaborative software and as a replacement for static intranets. There may be greater use of 
wikis behind firewalls than on the public Internet. 

On March 15, 2007, wiki entered the Oxford English Dictionary Online.[6] Wiki is
sometimes interpreted as a "backronym" for "What I Know Is", which describes its knowledge
contribution, storage, and exchange functions.


Trustworthiness 


Critics of open-source wiki systems argue that these systems could be easily tampered with,
while proponents argue that the community of users can catch malicious content and correct it.

Lars Aronsson, a data systems specialist, summarizes the controversy as follows:

    "Most people, when they first learn about the wiki concept, assume that a website that
    can be edited by anybody would soon be rendered useless by destructive input. It sounds
    like offering free spray cans next to a grey concrete wall. The only likely outcome would
    be ugly graffiti and simple tagging, and many artistic efforts would not be long lived.
    Still, it seems to work very well."


Characteristics


A wiki enables documents to be written collaboratively, in a simple markup language using a 
web browser. A single page in a wiki is referred to as a "wiki page", while the entire body of 
pages, which are usually highly interconnected via hyperlinks, is "the wiki". A wiki is essentially 
a database for creating, browsing and searching information. 

A defining characteristic of wiki technology is the ease with which pages can be created and 
updated. Generally, there is no review before modifications are accepted. Many wikis are open to 
the general public without the need to register any user account. Sometimes session log-in is 
requested to acquire a "wiki-signature" cookie for autosigning edits. Many edits, however, can be 
made in real-time, and appear almost instantaneously online. This can lead to abuse of the 
system. Private wiki servers require user authentication to edit, sometimes even to read pages. 


Editing wiki pages 


Traditionally content structure and formatting on wikis is implemented with a simplified 
markup language, sometimes known as "wikitext". For example, starting a line of text with an 
asterisk ("*") is often used to code an item in a bulleted list. Style and syntax of wikitexts can 
vary a great deal among wiki implementations, some of which also allow HTML tags. 

The reasoning behind this design is that HTML, with its many cryptic tags, is not especially 
human-readable. The actual text content is hard to read within HTML, making it difficult to edit. 
Wikis therefore favour plain-text editing with a few simple conventions of wikitext for structure 
and style. 

MediaWiki syntax:

    "''Doctor''? No other title? A ''scholar''? And he rates above the civil authority?"

    "Why, certainly," replied Hardin, amiably. "We're all scholars more or less. After
    all, we're not so much a world as a scientific foundation — under the direct control
    of the Emperor."

Equivalent HTML:

    <p>
    "<i>Doctor</i>? No other title? A <i>scholar</i>? And he rates above the civil authority?"
    </p>
    <p>
    "Why, certainly," replied Hardin, amiably. "We're all scholars more or less. After
    all, we're not so much a world as a scientific foundation — under the direct control
    of the Emperor."
    </p>

Rendered output (the words "Doctor" and "scholar" appear in italics):

    "Doctor? No other title? A scholar? And he rates above the civil authority?"

    "Why, certainly," replied Hardin, amiably. "We're all scholars more or less. After
    all, we're not so much a world as a scientific foundation — under the direct control
    of the Emperor."


(Quotation above from Foundation by Isaac Asimov) 
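
The wikitext conventions described above are simple enough to translate mechanically. The
following Python sketch is only an illustration, not MediaWiki's actual parser. The function
name wikitext_to_html is hypothetical, and it handles just two conventions: MediaWiki's
two-apostrophe italics markup and lines starting with an asterisk as bulleted list items.

```python
import html
import re


def wikitext_to_html(text: str) -> str:
    """Convert a tiny, assumed subset of wikitext to HTML.

    Supported: ''italic'' spans, '*' bulleted lines, blank-line paragraph breaks.
    """
    blocks = []
    # Blank lines separate paragraphs and lists.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.split("\n")
        # Escape raw HTML so that users cannot inject tags (quotes are left alone).
        lines = [html.escape(line, quote=False) for line in lines]
        # ''text'' becomes <i>text</i>.
        lines = [re.sub(r"''(.+?)''", r"<i>\1</i>", line) for line in lines]
        if all(line.startswith("*") for line in lines):
            items = "".join(f"<li>{line[1:].strip()}</li>" for line in lines)
            blocks.append(f"<ul>{items}</ul>")
        else:
            blocks.append("<p>" + " ".join(lines) + "</p>")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(wikitext_to_html("''Doctor''? No other title?\n\n* first item\n* second item"))
```

Escaping the input before applying the markup rules reflects the point made in the text:
users get a few safe conventions rather than direct access to HTML.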

Although limiting access to HTML and Cascading Style Sheets (CSS) limits users' ability to
alter the structure and formatting of wiki content, there are some benefits. Limited
access to CSS promotes a consistent look and feel, and having JavaScript disabled prevents
a user from adding code that could limit access for other users.

Increasingly, wikis are making "WYSIWYG" ("What You See Is What You Get") editing
available to users, usually by means of JavaScript or an ActiveX control that translates
graphically entered formatting instructions, such as "bold" and "italics", into the corresponding
HTML tags or wikitext. In those implementations, the markup of the newly edited
version of the page is generated and submitted to the server transparently, and the user is
shielded from this technical detail. WYSIWYG editors of wikis have nowhere near the capabilities
of the average word processor and usually do not produce good code.

Many implementations (for example MediaWiki) allow users to supply an "edit summary" 
when they edit a page. This is a short piece of text (usually one line) summarizing the changes. It 
is not inserted into the article, but is stored along with that revision of the page, allowing users to 
explain what has been done and why; this is similar to a log message when committing changes 
to a revision control system. 

Most wikis keep a record of changes made to wiki pages; often every version of the page is 
stored. This means that authors can revert to an older version of the page, should it be necessary 
because a mistake has been made or the page has been vandalised. 
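
The relationship between revisions, edit summaries, and reverting can be sketched in a few
lines of Python. This is a minimal in-memory model under assumed names (Revision, WikiPage,
revert_to are all hypothetical), not the storage scheme of any particular wiki engine. Note
that a revert is recorded as a new revision, so the history itself is never lost.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Revision:
    text: str      # full page text at this revision
    summary: str   # the edit summary; stored with the revision, not in the article
    author: str


@dataclass
class WikiPage:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, text: str, summary: str, author: str) -> None:
        """Store a new revision; older revisions are kept."""
        self.revisions.append(Revision(text, summary, author))

    @property
    def current_text(self) -> str:
        return self.revisions[-1].text if self.revisions else ""

    def revert_to(self, index: int, author: str) -> None:
        """Restore an older version by saving its text as a new revision."""
        old = self.revisions[index]
        self.edit(old.text, f"Reverted to revision {index}", author)


if __name__ == "__main__":
    page = WikiPage("Sandbox")
    page.edit("A wiki is a collaborative website.", "Created page", "alice")
    page.edit("SPAM SPAM SPAM", "", "vandal")
    page.revert_to(0, "bob")
    for number, rev in enumerate(page.revisions):
        print(number, rev.author, repr(rev.summary))
```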


Controlling changes 


Wikis are generally designed with the philosophy of making it easy to correct mistakes,
rather than making it difficult to make them. Thus, while wikis are very open, they provide a
means to verify the validity of recent additions to the body of pages. The most prominent, on
almost every wiki, is the "Recent Changes" page: a list of either a set number of recent edits
or all the edits made within a given time frame. Some wikis can filter the list to remove minor
edits and edits made by automatic importing scripts ("bots").
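
A "Recent Changes" list with these filters might be modelled as follows. This is a sketch
under stated assumptions: the Change record and the recent_changes function, with its
hide_minor and hide_bots flags, are hypothetical names chosen for illustration and do not
correspond to any specific wiki engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Change:
    page: str
    author: str
    timestamp: datetime
    minor: bool = False   # flagged by the editor as a minor edit
    bot: bool = False     # made by an automatic script


def recent_changes(
    changes: List[Change],
    now: datetime,
    window: timedelta = timedelta(days=1),
    hide_minor: bool = False,
    hide_bots: bool = False,
) -> List[Change]:
    """Return edits made within `window` before `now`, newest first."""
    since = now - window
    selected = [
        c for c in changes
        if c.timestamp >= since
        and not (hide_minor and c.minor)
        and not (hide_bots and c.bot)
    ]
    return sorted(selected, key=lambda c: c.timestamp, reverse=True)
```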

From the change log, other functions are accessible in most wikis: the Revision History 
showing previous page versions; and the diff feature, highlighting the changes between two 
revisions. Using the Revision History, an editor can view and restore a previous version of the 
article. The diff feature can be used to decide whether or not this is necessary. A regular wiki 
user can view the diff of an edit listed on the "Recent Changes" page and, if it is an unacceptable 
edit, consult the history, restoring a previous revision; this process is more or less streamlined, 
depending on the wiki software used.
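
The diff feature can be illustrated with Python's standard difflib module. This is a
minimal sketch, not how any given wiki engine computes its diffs; the function names
diff_revisions and size_change are hypothetical.

```python
import difflib
from typing import List


def diff_revisions(old_text: str, new_text: str) -> List[str]:
    """Return unified-diff lines describing how old_text became new_text."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="previous revision",
            tofile="current revision",
            lineterm="",
        )
    )


def size_change(old_text: str, new_text: str) -> int:
    """Net number of characters added (positive) or removed (negative)."""
    return len(new_text) - len(old_text)


if __name__ == "__main__":
    old = "Wikis are fast.\nAnyone can edit."
    new = "Wikis are fast.\nAnyone can edit, and edits are logged."
    print("\n".join(diff_revisions(old, new)))
    print("characters added:", size_change(old, new))
```

An editor would read the lines prefixed with "-" and "+" to decide whether the newer
revision should be kept or an older one restored.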

In case unacceptable edits are missed on the "Recent Changes" page, some wiki engines
provide additional content control. Changes can be monitored to ensure that a page, or a set
of pages, keeps its quality. A person willing to maintain pages will be notified of modifications
to them, allowing him or her to verify the validity of new edits quickly.


Security 


The open philosophy of most wikis, allowing anyone to edit content, does not ensure that all
editors are well-meaning. Vandalism can be a major problem. In larger wiki sites, such as those
run by the Wikimedia Foundation, vandalism can go unnoticed for a period of time. Wikis by
their very nature are susceptible to intentional disruption, known as "trolling". Wikis tend to take
a soft security approach to the problem of vandalism, making damage easy to undo rather than
attempting to prevent damage. Larger wikis often employ sophisticated methods, such as bots
that automatically identify and revert vandalism and JavaScript enhancements that show how
many characters have been added in each edit. In this way vandalism can be limited to just
"minor vandalism" or "sneaky vandalism", where the characters added or eliminated are so few
that bots do not identify them and users do not pay much attention to them.

The amount of vandalism a wiki receives depends on how open the wiki is. For instance,
some wikis allow unregistered users, identified by their IP addresses, to edit content, whilst
others limit this function to registered users. Most wikis allow IP editing but give registered
users some extra functions to help them in editing. On most wikis, becoming a registered user
is very simple and can be done in seconds, but the new user is kept from using the extra editing
functions for a while. On the English Wikipedia, registered users must wait three days after
creating an account to gain access to the new tools. On the Portuguese Wikipedia, users must
make at least 15 constructive edits, proving their trustworthiness and usefulness on the system,
before they are authorized to use the added tools. Basically, "closed up" wikis are more secure
and reliable but grow slowly, whilst more open wikis grow at a steady rate but are an easy
target for vandalism.
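
The two kinds of waiting rule described above (a minimum account age, or a minimum number
of constructive edits) can be expressed as a small policy check. The sketch below is
illustrative only: TrustPolicy and has_extra_tools are hypothetical names, and the two
example policies simply encode the figures quoted in the text (three days, 15 edits).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class TrustPolicy:
    min_account_age: Optional[timedelta] = None      # None means no age requirement
    min_constructive_edits: Optional[int] = None     # None means no edit requirement


# Example policies using the thresholds mentioned in the text above.
AGE_BASED = TrustPolicy(min_account_age=timedelta(days=3))
EDIT_BASED = TrustPolicy(min_constructive_edits=15)


def has_extra_tools(
    policy: TrustPolicy,
    registered_at: datetime,
    constructive_edits: int,
    now: datetime,
) -> bool:
    """Return True if a registered user satisfies every requirement of the policy."""
    if policy.min_account_age is not None:
        if now - registered_at < policy.min_account_age:
            return False
    if policy.min_constructive_edits is not None:
        if constructive_edits < policy.min_constructive_edits:
            return False
    return True
```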


Linking and creating pages 


Hierarchical navigation menus are often not useful in wikis as multiple authors create and 
delete pages in an ad hoc manner. Non-linear navigational methods are more appropriate. 
Typically each page contains a large number of hypertext links to other pages. Links are created 
using a specific syntax, the so-called "link pattern" or CURIE syntax. 

Originally, most wikis used CamelCase to name pages and create links. These names are
produced by capitalizing words in a phrase and removing the spaces between them (the word
"CamelCase" is itself an example). While CamelCase makes linking very easy, it also leads to 
links which are written in a form that deviates from the standard spelling. CamelCase-based 
wikis are instantly recognizable because they have many links with names such as 
"TableOfContents" and "BeginnerQuestions". It is possible for a wiki to render the visible 
anchor for such links "pretty" by reinserting spaces, and possibly also reverting to lower case. 
However, this reprocessing of the link to improve the readability of the anchor is limited by the 
loss of capitalization information caused by CamelCase reversal. For example, "RichardWagner" 
should be rendered as "Richard Wagner", whereas "PopularMusic" should be rendered as 
"popular music". There is no easy way to determine which capital letters should remain 
capitalized. As a result, many wikis now have "free linking" using brackets, and some disable 
CamelCase by default. 
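
The CamelCase convention, and the ambiguity that arises when reversing it, can be
demonstrated with a short Python sketch. The regular expression and the function names
(find_camel_case_links, display_text) are assumptions made for this example; real wiki
engines use their own, often more permissive, link patterns.

```python
import re
from typing import List

# Two or more capitalized words run together, e.g. "TableOfContents".
CAMEL_CASE = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def find_camel_case_links(text: str) -> List[str]:
    """Return every CamelCase word in the text; each would become a link."""
    return CAMEL_CASE.findall(text)


def display_text(link: str, lowercase: bool = False) -> str:
    """Make a link 'pretty' by reinserting spaces, optionally lowercasing it."""
    words = re.findall(r"[A-Z][a-z]+", link)
    pretty = " ".join(words)
    return pretty.lower() if lowercase else pretty


if __name__ == "__main__":
    print(find_camel_case_links("See TableOfContents or BeginnerQuestions."))
    # Neither setting is right for both names: one rule cannot know that
    # "Richard Wagner" is a proper name while "popular music" is not.
    for name in ("RichardWagner", "PopularMusic"):
        print(display_text(name), "|", display_text(name, lowercase=True))
```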


Searching 


Most wikis offer at least a title search, and sometimes a full-text search. The scalability of the
search depends on whether the wiki engine uses a database. Indexed database access is necessary
for high speed searches on large wikis. Alternatively, external search engines such as Google can
sometimes be used on wikis with limited searching functions in order to obtain more precise
results. However, a search engine's indexes can be very out of date (days, weeks or months) for
many websites.
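
To show why an index matters, the sketch below builds a simple in-memory inverted index
that supports both title search and full-text search. It is an illustration under stated
assumptions, not the search implementation of any wiki engine; the SearchIndex class and
its methods are hypothetical, and a production wiki would typically rely on its database's
indexing instead.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Maps each word to the set of page titles that contain it."""

    def __init__(self) -> None:
        self._title_words: Dict[str, Set[str]] = defaultdict(set)
        self._body_words: Dict[str, Set[str]] = defaultdict(set)

    def add_page(self, title: str, body: str) -> None:
        for word in tokenize(title):
            self._title_words[word].add(title)
        for word in tokenize(body):
            self._body_words[word].add(title)

    def search(self, query: str, full_text: bool = False) -> Set[str]:
        """Return titles matching every query word (title only by default)."""
        words = tokenize(query)
        if not words:
            return set()
        results = None
        for word in words:
            hits = set(self._title_words.get(word, set()))
            if full_text:
                hits |= self._body_words.get(word, set())
            results = hits if results is None else results & hits
        return results
```

Each lookup touches only the entries for the query words rather than scanning every page,
which is the property that keeps searches fast as a wiki grows.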


Wiki software architecture 


Nearly all wikis are implemented as server software. However, some have been implemented 
purely on the client side using JavaScript, and some have been built on top of peer-to-peer
networks.


Wiki communities 


Many wiki communities are private, particularly within enterprises. They are often used as 
internal documentation for in-house systems and applications. The "open to everyone", all- 
encompassing nature of Wikipedia is a significant factor in its growth, while many other wikis 
are highly specialized. 

There also exist WikiNodes which are pages on wikis that describe related wikis. They are 
usually organized as neighbors and delegates. A neighbor wiki is simply a wiki that may discuss 
similar content or may otherwise be of interest. A delegate wiki is a wiki that agrees to have 
certain content delegated to that wiki. 

One way of finding a wiki on a specific subject is to follow the wiki-node network from wiki 
to wiki; another is to take a Wiki "bus tour," for example: Wikipedia's Tour Bus Stop. Domain 
names containing "wiki" are growing in popularity to support specific niches. 

For those interested in creating their own wiki, there are many publicly available "wiki 
farms", some of which can also make private, password-protected wikis. PeanutButterWiki, 
Socialtext, Wetpaint, and Wikia are popular examples of such services. For more information, 
see List of wiki farms. Note that free wiki farms generally contain advertising on every page. For 
those interested in how to build a successful wiki community, and encourage wiki use, 
Wikipatterns is a guide to the stages of wiki adoption and a collection of community-building 
and content-building strategies. 

The English-language Wikipedia has the largest user base among all wikis and ranks in the
top twenty among all websites in terms of traffic. Other large wikis include the WikiWikiWeb,
Memory Alpha, Wikitravel, World66 and Susning.nu, a Swedish-language knowledge base. The 
largest wikis are listed and updated on Wikimedia's "meta" wiki. 


Wikis and content management systems 


Wikis have shared several features with, and encouraged their adoption by, generalized
content management systems (CMS), which are used by enterprises and communities of practice.
Those looking to compare a CMS with an enterprise wiki should consider these basic features:

1. The name of an article is embedded in the hyperlink.

2. Articles can be created or edited at any time by anyone (with certain limitations
for protected articles).

3. Articles are editable through the web browser.

4. Each article provides one-click access to the history/versioning page, which also
supports version differencing ("diff") and retrieving prior versions.

5. The most recent additions/modifications of articles can be monitored actively
or passively.

6. Easy revert of changes is possible.

None of these are particular to a wiki, and some have developed independently. Still, the
concept of a wiki unequivocally refers to this core set of features. Taken together, they fit the
generative nature of the Internet in encouraging each user to help build it. It is yet to be
studied whether an enterprise wiki encourages more usage, or leads to more knowledgeable
community members, than other content management systems.


See also 


• List of wikis

• List of wiki software

• Content management system


Wiki software 
Wiki software is a type of collaborative software that runs a wiki system. This typically
allows web pages to be created and edited using a common web browser. It is usually
implemented as a server-side script that runs on one or more web servers, with the content
generally stored in a relational database management system, although some implementations
use the server's file system instead.


Class 
Web-based wiki 


The first such system was created by Ward Cunningham in 1995, but given the relative 
simplicity of the wiki concept, a large number of implementations now exist, ranging from very 
simple "hacks" implementing only core functionality to highly sophisticated content 
management systems. The primary difference between wikis and more complex types of content 
management systems is that wiki software tends to focus on the content, at the expense of the 
more powerful control over layout seen in CMS software like Drupal or WebGUI, or at the
expense of non-wiki features (articles, blogs, etc.) as in TikiWiki.

"Wiki software" could be interpreted as comprising all of the software required to run a wiki, 
which might include a web server such as Apache, in addition to the Wiki engine itself, which 
implements the wiki technology. In some cases, such as ProjectForum, or some WikiServers, the 
web server and wiki engine are bundled together as one self-contained system, which can often 
make them easier to install. 

The majority of engines are open source, often available under the GNU General Public 
License (GPL); large projects such as TWiki and the Wikipedia engine, MediaWiki, are 
developed collaboratively. Many wikis are highly modular, providing APIs which allow 
programmers to develop new features without requiring them to be familiar with the entire 
codebase. 

It is hard to determine which wiki engines are the most popular, although a list of lead 
candidates might include TWiki, MoinMoin, PmWiki, DokuWiki and MediaWiki (Google trend 
history comparison). A list of some of those available is included below, and another can be 
found at Wiki: WikiEngines. 


Personal wiki 
Some wiki software is not intended for collaborative work, but for either content
management or personal information organizing. This is sometimes called a Desktop Wiki or
a Personal Wiki.


See also 
• List of wiki software
• Comparison of wiki software - feature-table comparison of common wiki software
• List of wikis - websites using wiki software, organized by topic


List of wiki software


This is a list of notable wiki software applications. For a list of websites using wiki
software, organized by topic, see List of wikis.


Java-based 


• Clearspace is a commercial J2EE application, made by Jive Software, which
combines Wiki, Blog, and Document Management functionality into a complete
enterprise collaboration solution. Clearspace uses wiki-style markup or
WYSIWYG ("What You See Is What You Get") editing to allow for clean version
control and workflow management.

• Confluence (wiki) is a commercial J2EE application which combines Wiki and
some blog functionality. Its features include PDF page export and page
refactoring, and it can be run on any application server using any RDBMS
backend.

• Corendal Wiki is a GPL application for corporate environments, with tight
integration with Active Directory.

• JAMWiki is a Java-based wiki engine that is built to possess many of the features
of MediaWiki.

• JSPWiki is a J2EE application released under the LGPL.

• Kerika is a commercial Java application which combines whiteboarding and
document management. It works on a peer-to-peer model and is available for
Windows, Macs and Linux. Team members who are not using Kerika get their
project updates sent to them automatically by email.

• Mindquarry creates a WYSIWYG wiki for each team. It is built using Apache
Cocoon and is thus based on Java. Released under the Mozilla Public License.

• SnipSnap is a Java-based package that combines Wiki and blog concepts. It
includes its own web server, but can be built as a war file for use in other servlet
engines. Released under the GNU General Public License.

• Traction TeamPage is a commercial application based on the principles of
Douglas Engelbart's On-Line System (NLS) which aggregates multiple blog /
Wiki spaces using a sophisticated permission and inline comment model.

• VQ Wiki ("very quick wiki") does not require a database.

• Yawiki is a J2EE wiki whose documentation is released under the GNU General
Public License. It has a robot, YawikiBot, which provides an artificial-intelligence
front end. The rich client used for editing runs via Java Web Start.

• XWiki is a Java wiki engine with a complete wiki feature set (version control,
attachments, etc.) and a database engine and programming language which
allow database-driven applications to be created using the wiki interface.


JavaScript-based 


• TiddlyWiki is an HTML/JavaScript-based server-less wiki in which the entire site
is contained in a single file.


Lisp-based


• CLiki is a free collaborative hypertext authoring program written in Common
Lisp. Modelled on Wiki, it is free software released under the MIT license. It
presently runs in SBCL and requires Araneida, which needs the SBCL socket
library. Considered extremely powerful, it has been implemented at cliki.net,
metacircles.com, and cliki.tunes.org.

• Kiwi is a heavily reworked port of CLiki to Allegro Common Lisp running the
AllegroServe web server.

• Svnwiki is written in Chicken Scheme.


Microsoft services-based 


• WikiPoint by Neoworks is a wiki extension for Windows SharePoint Services
version 2 and Microsoft Office SharePoint Portal Server 2003.

• Windows SharePoint Services version 3 has built-in Wiki support. It is built on
ASP.NET and Microsoft SQL Server.


.NET/Mono-based


• ArtificialMemory is a Semantic Wiki written in C# and using Microsoft SQL Server.

• DotWiki is a Wiki clone using Visual Basic .NET and Microsoft SQL Server.

• FlexWiki is written in C#, uses the .NET framework, and stores data in XML files
or Microsoft SQL Server. Binary and source code are available for download.

• WWWiiki by ktomics is a DotNetNuke module which offers page versioning and
several other features.

• Perspective is written in C#, uses the .NET framework, and stores data in XML
files. Binary and source code are available for download.

• ScrewTurn Wiki is open source wiki software written in ASP.NET (C#),
released under the GPL.

• MindTouch Deki Wiki is a free open source wiki developed in C# on Mono and
PHP, released under GPLv2 and LGPLv2. It is a platform-independent wiki
that is language agnostic, meaning it can be extended using any programming
language that can be exposed via a Web service. The application consists of a
services layer, where the majority of the application's logic is implemented in
C#, and a presentation layer in PHP. This makes it possible to dramatically alter
the user interface without digging too deeply into complex application code.


Pascal-Based 


• PasWiki is a CGI-based wiki written using FreePascal; it should also work with
Delphi/Kylix.


Perl-based


• CorneliOS, a webOS, includes wiki software that is also the engine for Galaxiki.

• Noösphere, the engine for PlanetMath.

• PodWiki is a wiki which supports multiple markup languages, primarily Perl
POD.

• Socialtext is an enterprise wiki and weblog partially derived from open-source
Kwiki. Socialtext is available as a hosted service, a dedicated hardware
appliance, or as free downloadable open-source software (and a VMware virtual
appliance) from SourceForge.

• TWiki is a structured wiki, typically used to run a project development space, a
document management system, a knowledge base, or any other groupware tool.
Also available as a VMware appliance.

• UseModWiki (Clifford Adams, 2000) is a clone of AtisWiki.

    o Oddmuse is a fork of UseModWiki.

    o PurpleWiki is a major rewrite of UseModWiki that implements Purple
    Numbers and Transclusion.

• WikiWikiWeb (Ward Cunningham, 1994)


While not strictly Wiki software, weblog-engine Blosxom mostly meets the definition when 
used with its wikieditish and wikiwordish plugins. There are also plugins available that enable 
Blosxom to use the text parsers from Kwiki, TWiki, or PurpleWiki. 


PHP-based 


Please note that many of the wikis below do not support international characters; this is the
case for WikkaWiki and other forks of Wakka, PhpWiki, easyWiki and others.


• GetWiki is a highly modified version of MediaWiki 1.1.0.

• MediaWiki was custom-designed for the high-volume Wikipedia encyclopedia
project; it is also used for all other projects run by the Wikimedia Foundation,
which operates Wikipedia. It is written in PHP and uses a MySQL or PostgreSQL
database backend; however, it is publicly available and suitable for tailoring to
other applications.

    o IpbWiki inherits the functionality of MediaWiki and adds an integration
    layer to Invision Power Board on top of it (single sign-on, forum content in
    wiki, article rating system, etc.)

• TigerWiki is a minimalist and simple wiki framework.

• TikiWiki is one of the larger and more ambitious wiki development projects,
including a variety of additional groupware features (message forums, articles,
etc.).

• DokuWiki is a simple-to-use Wiki aimed at the documentation needs of a small
company. It uses plain text files and has a simple but powerful syntax which
ensures the data files remain readable outside the Wiki.

• PhpWiki is a WikiWikiWeb clone in PHP.

• PmWiki is a PHP-based wiki. Features include: GPL-licensed, easy
installation/customization, designed for collaborative authoring and
maintenance of web sites, and support for internationalization. Does not
require a database.

• PukiWiki is a PHP-based wiki (Japanese). Their site has not been fully translated
into English.

• RabbitWiki is a very simple PHP flat-text-based wiki. It also has an option to use
SQL.

• WakkaWiki is a PHP/MySQL-based lightweight wiki engine. Wakka is no longer
maintained, but survives in a number of forks:

    o CitiWiki has been called the "Wiki of the next generation".

    o UniWakka is another fork of Wakka that aims at providing a collaborative
    authoring tool for scientific web content. It supports WikiFarms
    installations, MathML, footnotes, tables of contents, BibTeX import and
    export, LaTeX export, LaTeX-like citations, OpenOffice export and more.

    o WackoWiki is a fork of Wakka, with many new features and a multilingual
    interface. Shares several modules, developers and a bugtracker with an NPJ
    engine.

    o WikkaWiki is a light, standards-compliant, configurable fork of Wakka with
    many improvements and new features (among which is native support for
    Mindmaps).

• Wiclear is a simple PHP/MySQL-based lightweight wiki engine targeted at data
organization and multiple languages. It also features customization via themes
and extensibility via plugins.


Python-based 


• MoinMoin is a Wiki clone written in Python. Offers good access control based on
user groups.

• OghamWiki is a WYSIWYG wiki designed for non-technical users.

• PikiPiki is a Python-based Wiki. It is fairly basic, quick and simple to install, and
offers reasonable security.

• PikiePikie is another Python-based Wiki. It produces a Wiki that resembles a
typical website, and allows visitors to choose which "skin" to view the site with.

• Sycamore offers web-based administration and double-clicking to edit
specific sections of pages.

• TamTam

• Trac is a Wiki clone that integrates simple issue tracking and an interface to
Subversion.

• Zwiki is a powerful Zope-based GPL wiki engine. It can integrate with the CMF
content management framework and Plone, and supports several kinds of
markup as well as WYSIWYG HTML editing.

More at the Python wiki:


Ruby-based


• Instiki is a Wiki clone written in Ruby.

• Pimki is a PIM (Personal Information Manager) loosely based on Instiki's Wiki
technology.

• Ruwiki is a simple, extensible Wiki, supporting CGI and WEBrick interfaces,
templates, CSS formatting, namespaces, and internationalisation.

• Hiki is another Wiki clone written in Ruby, originating in Japan.

• JunebugWiki is a clean, simple, and minimalist wiki built on the Camping
microframework, intended for personal use.

• Metadot Wiki is the 7th edition of the popular Metadot Portal Server. The
Metadot Wiki is a commercial-quality product that can be used as an intranet,
extranet or wiki. Metadot has rewritten the product in ROR, adding complete
wiki functionality. The Metadot Wiki is available as source code. The source
code does not include the wiki-on-demand or management console.


Smalltalk 


• Swiki is written in Squeak, and considered to be "super-portable and easy to set 
up and use". It runs on common platforms, including Mac, Windows, *nix, as 
well as others. 

• SmallWiki is a wiki for VisualWorks, also ported to Squeak. Version 2, called 
Pier, is being written for Seaside, a web application framework similar to Rails. 


Peer-To-Peer 


• Code Co-op (a P2P version control system) has an integrated wiki. 

• Kerika is a commercial Java application which combines whiteboarding and 
document management. It works on a peer-to-peer model and is available for 
Windows, Macs and Linux. Team members who are not using Kerika get their 
project updates sent to them automatically by email. 


PDA 


• AcroWiki is a commercial editing application with wiki-like syntax for PalmOS. 
It stores the notes as Memos (in a separate category) so they can be opened on 
desktop machines and exported to an online wiki. 


VBScript-based 

• OpenWiki is written in VBScript, runs on ASP (Active Server Pages), and stores data in XML 
files or Microsoft SQL Server. It combines useful features of several Wiki 
engines, particularly UseMod and MoinMoin, with Windows Integrated 
Authentication so users are logged in transparently. 


Miscellaneous 
• Apple's HyperCard was a direct inspiration for Ward Cunningham's original 
Wiki software. 

• OpenTeams is a commercial web-hosted wiki with blogging, tags, discussion, file 
attachment, and SSL encryption functionality in a simple, email-like, 3-pane, 
Web 2.0/Ajax interface. 

• Techwiki claims to be a Wiki optimized for technical writing: 
mathematics, equations and the like. 

• WikiServer is a self-contained, easy-to-install wiki written in C++. The current 
stable release runs on Windows and Linux. 

• Ikiwiki is a wiki compiler written in Perl by Joey Hess. 


See also 


• Comparison of wiki software 


Comparison of wiki software 


The following tables compare general and technical information for a number of wiki 
software packages. 


General Information 


Wiki software | Creator | First public release date | Latest stable release | Stable release date | Predecessor | Cost (USD) | Software licence | Programming language | Data backend
BrainKeeper | BrainKeeper, Inc. | 2006-04-19 | | 2007-02-15 | N/A | Free version available, subscriptions start at $35/month | Proprietary | PHP | RDBMS
Central Desktop | Central Desktop Inc. | 2005-10-01 | Winter Release '06 | 2006-09-12 | N/A | Free version available, paid plans start at $25/month | Proprietary | PHP | PostgreSQL
Clearspace | Jive Software | 2007-02-07 | 1.2 | 2007-06-25 | N/A | Free for 5 users, additional users $29/user/year | | J2EE/Java | SQL/LDAP including: MySQL, Oracle, Postgres, IBM DB2 and SQL Server
Confluence | Atlassian Software Systems | 2004-05-05 | 2.5.4 | 2007-06-12 | N/A | Free (open source project, community or personal), $1200-8000 commercial, $600-4000 academic | Commercial, Academic, Community, Personal, Open Source Project | Java, Java EE | Relational DB (PostgreSQL, MySQL, Oracle, DB2, MS SQL Server)
Corendal Wiki | Thierry Danard | 2005-05-14 | | 2007-02-12 | N/A | Free | GPL | Java | MySQL or Oracle
coWiki | Daniel T. Gorski; Paul Hanchett | 2002-03 | 0.3.4 (no longer supported) | 2005-02-24 | N/A | Free | GPL | PHP5 | MySQL
DekiWiki | MindTouch, OpenGarden.org | 2006-07-25 | 2006.07.25 | 2006-07-25 | MediaWiki | Free and paid hosted versions. Free open source version available for download. Paid enterprise versions available as well | GPL | PHP, Mono | MySQL
DidiWiki | Matthew Allum | ? | 0.5 | 2004-09-30 | N/A | Free | GPL | C | File system
DokuWiki | Andreas Gohr | | | | | Free | GPL v2 | PHP | File system
EditMe | EditMe, LLC | 2002-08-01 | 2007.06.25 | 2007-06-25 | N/A | From $5/mo | Proprietary | Java (hosted) | MySQL
FlexWiki | David Ornstein | ? | | 2005-10-02 | N/A | Free | FlexWiki Shared Source Licensing Program | ASP.NET, C# | File system, SQL Server
Instiki | David Heinemeier Hansson | ? | 0.11.0 | 2006-03-19 | N/A | Free | Ruby License | Ruby | ActiveRecord
JotSpot | JotSpot | 2004-10 | 2.0 | 2006-07-24 | N/A | From $100/mo | Proprietary | Java | File system, XML
JSPWiki | Janne Jalkanen | 2001-07-06 | 2.4.100 | 2007-03-02 | N/A | Free | LGPL | Java | Flat-file, RCS, SVN
Kerika | Kerika, Inc. | 2005-05-01 | 1.1 | 2007-05-01 | N/A | $9.95/month/user; free for anyone with ".edu" email address | Proprietary | Java | Object-oriented database
MediaWiki | Magnus Manske; Various | 2002-01-25 | 1.10.0 | 2007-05-09 | N/A | Free | GPL v2 | PHP | MySQL, PostgreSQL
Midgard Wiki | Henri Bergius | 2004-09-29 | 1.8.3 | 2007-04-25 | N/A | Free | LGPL | PHP | MySQL and RCS
MoinMoin | Jürgen Hermann; Thomas Waldmann | 2000-07-28 | 1.5.7 | 2007-02-07 | N/A | Free | GPL | Python | Flat-file
Near-Time | Near-Time, Inc. | 2005-11-20 | 2007.28.0 | 2007-06-27 | N/A | Starts at $24.99 per month for unlimited spaces and unlimited users | Proprietary | Ruby on Rails | MySQL
OpenWiki | | ? | 0.78 sp1 | 2002-03-19 | N/A | Free | | ASP, VBScript | MS Access and SQL Server
OpenLink Wiki | | | | | | Free | GPL and Commercial License | |
PAUX | | | | | | From €300/mo | Free and Commercial License | |
Perspective | Alan Slater | 2004-03 | 0.922 | 2007-02-06 | N/A | Free | GPL | ASP.NET; C#, XSLT | Microsoft Indexing Service, XML
PhpWiki | Steve Wainstead | 1999-12 | 1.3.14 | 2007-07-01 | WikiWikiWeb | Free | GPL | PHP | Berkeley DB, Flat-file, MySQL, PostgreSQL, SQL Server, Oracle 8, Firebird
PmWiki | Patrick Michaud | 2002-02-06 | 2.1.27 | 2006-12-11 | N/A | Free | GPL | PHP | Flat-file
Socialtext | Socialtext | First public release in 2003 | 2.9.5.2 | June 2007 | 2.9.5.1 | Pricing varies. Available deployment options include: free open-source and VMware image downloads, hosted services and dedicated appliances | Open-source and commercial licensing options | Perl | PostgreSQL
Swiki | Mark Guzdial; Jochen Rick | 1999-10 | 1.5 | 2005-12-06 | N/A | Free | GPL | Squeak | File system
Sycamore | Philip Neustrom | 2006-05 | 1.0 | 2006-05-17 | MoinMoin | Free | GPL | Python | PostgreSQL
ThoughtFarmer | ThoughtFarmer | May 2006 | 2.2.8 | 2007-07-31 | N/A | From $35,000 | Proprietary | ASP.NET, C# | SQL Server
TigerWiki | Chabel.org | Mid-2005 | 2.19 | 2006-11-02 | roWiki | Free | GPL | PHP | File system
TikiWiki | Luis Argerich (200+ devs nowadays) | 2002-10-09 | 1.9.7 | 2006-11-30 | N/A | Free | GPL | PHP | PostgreSQL, MySQL, Oracle, AdoDB
TWiki | Peter Thoeny | 1998-10 | 4.1.2 | 2007-03-03 | JosWiki | Free | GPL | Perl | Flat-file, RCS, pluggable storage backend
UseModWiki | Clifford Adams | 2000-01-22 | 1.0 | 2003-09-12 | AtisWiki | Free | GPL | Perl | Flat-file
WackoWiki | Roman Ivanov | 2003-03 | R4.2 | 2005-03 | WakkaWiki | | | PHP | MySQL
Wiclear | David Jobet | 2004-06-25 | 0.10 | 2006-03-26 | N/A | Free | GPL | PHP | MySQL
Wikiwig | | | | | | | | |
WikkaWiki | Wikka Development Team | 2004-05-16 | 1.1.6.3 | 2007-05-07 | WakkaWiki | Free | GPL | PHP | MySQL
WikkiTikkiTavi | Tavi | ? | 0.26 (no longer developed) | ? | N/A | Free | GPL | PHP | MySQL
XWiki | Ludovic Dubost | 2004-02 | 1.0 | 2007-05-22 | N/A | Free | GPL/LGPL | Java | PostgreSQL, MySQL, Oracle, Apache Derby, HSQLDB
ZWiki | Simon Michael | 1999-11-05 | 0.59.0 | 2007-04-02 | N/A | Free | GPL | Python | ZODB Object Database


Target audience 


Wiki software Public Private Corporate/Enterprise Education Intranet 
BrainKeeper Yes Yes Yes Yes ? 
Central Desktop Yes Yes Yes Yes Yes 
Clearspace Yes Yes Yes Yes Yes 
Confluence Yes Yes Yes Yes 
Corendal Wiki Yes 
coWiki Yes Yes Yes 
DekiWiki Yes Yes Yes 
DidiWiki 
EditMe Yes Yes Yes Yes Yes 
FlexWiki Yes Yes 
Instiki Yes Yes 
JotSpot Yes Yes Yes 
JSPWiki Yes Yes Yes 
Kerika Yes Yes Yes Yes 
MediaWiki Yes Yes Yes 
Yes 


Yes 


Yes 


Yes 
Yes 
Yes 
Yes 


Yes 


Midgard Wiki Yes Yes _ Yes 

MoinMoin Yes Yes Yes 

Near-Time Yes Yes Yes Yes 
OpenWiki Yes Yes Yes 

Perspective 

PhpWiki Yes Yes Yes Yes 
PmWiki Yes Yes 

Socialtext Yes Yes Yes Yes 
Wiki software Public Private Corporate/Enterprise Education Intranet 
Swiki Yes Yes Yes 
Sycamore Yes Yes 

ThoughtFarmer Yes 

Tiger Wiki Yes Yes Yes Yes 
TikiWiki Yes Yes Yes Yes 
TWiki Yes Yes Yes Yes 
UseModWiki Yes Yes _ Yes Yes 
WackoWiki Yes Yes Yes Yes 
Wiclear Yes Yes 

Wikiwig Yes Yes Yes Yes 
WikkaWiki Yes Yes Yes Yes 
WikkiTikkiTavi Yes 

XWiki Yes Yes Yes Yes 
ZWiki Yes Yes Yes Yes 


Wiki software Public Private Corporate/Enterprise Education Intranet 


Features 1 
File Spam Page Inline 
uploading prevention access HTML! 
control!4! 
attachme 
nts 
BrainKeeper Yes Yes, CAPTCHA No Yes 
Central Yes Yes, CAPTCHA No Yes 
Desktop 
User- Document 
customiza renaming 
ble 

interface!® 


Yes, Yes 
templates, 
html, 

custom 
navigation 
portal, 

custom 

data fields 
Yes, No 
templates 

and 

themes, 

html and 


Clearspace 
Confluence 


Corendal 
Wiki 


coWiki 


DekiWiki 
DidiWiki 


DokuWiki 


EditMe 


FlexWiki 


Instiki 


JotSpot 
JSPWiki 


Kerika 
Yes Yes, CAPTCHA Yes optional 
Yes No Yes Yes 
No No Yes, dueNo 
to UNIX- 
style 
permissio 
ns 
Yes Yes Yes optional 
No No No escaped 
Yes Yes, blacklist Yes, optional 
optional 
Yes No, blacklist Yes, Yes 
optional 
No ? No? No 
Yes No No Yes 
File Spam Page Inline 
uploading prevention access HTML! 
/ control!7! 
attachme 
nts 
Yes No Yes Yes, optional 
Yes Yes, content by Yes Yes, optional 
regexp 
Yes Yes, ACL  forYes No 
CSS 


Yes, 
templates 
and themes 
Partial - Yes 

Velocity 

templates + 

CSS 

Partial -Yes, all pages 
hand-edited are seamlessly 
templates, updated 
document 

"Print 

version" 

Yes Yes 

Partial -rename the 
style-sheets document file 
Partial - Yes, plugin 
CSS, PHP; 

heavily 
documente 
d PHP API 
Partial 
CSS, 
XSLT, user 
editable 
navigation 
Partial 
style- 
sheets, 
templates, 
WikiTalk 
Partial 
CSS 
User- Document 
customiza renaming 
ble 

interface!”! 


Yes, links are 
updated 


- Yes 


-Yes, old page 
becomes a 
redirect 


-No 


Partial - Yes 

CSS 

Partial -Patch available 
templates 

Partial - Yes, full 


MediaWiki 


Midgard 
Wiki 


MoinMoin 


Near-Time 


OpenWiki 
OpenLink 
Wiki 


PAUX 


Perspective 


PhpWiki 


PmWiki 


Yes 


Yes 


Yes 


Yes 


Yes 
Yes 


Yes 


Yes 


Yes 


Yes 
each project 


Yes, URL Yes 
blacklist,"° word 
blacklist,"!! IP 
blocking, 

captchas (as used 

on wikinews) 


No Yes 


Yes, BadContent Yes 


filtering via 

Regular 

Expressions 

Yes Yes 

Yes Yes 

Yes Yes 

Yes Yes 
Yes 
) 

Yes, CPAN Yes 

Blog::SpamAssa 

ssin 


Yes, word/URL Yes 
block (addon, 
Yes 


Yes 


Yes 


Yes 


module 


customizab document 

le toolbar management 
Partial -Yes, old page 
many becomes a 
features areredirect 

user- 

customizab 

le, 

templates!'? 

] 

Partial No 

templates, 

CSS 

Yes -Yes, old page 
Themes, can be a 
templates, redirect 

CSS, 

XSLT, user 

editable 

navigation 

Yes, Yes 
templates, 

html, 

custom 

navigation 

portal, 

custom 

data fields, 

CSS 


templates, Yes 
skins, user 
Cao. 

XSLT 
templates, Yes 
skins, user 
CSS 


(perparameterized XSLT, Yes, with 
collection Raw Includes,CSS 
in-page XSLT 
plugin 


aliases 


themes; un- Yes 
documente 

d 

themes, Template:Mod 
user CSS,ule 
auto-upd. 
database), URL- 
approval 
(option), encrypt 
e-mails (addon), 


nofollow, 
Captcha (addon) 
Socialtext Yes Yes Yes 
File Spam Page 
uploading prevention access 
/ control!5 
attachme ] 
nts 
Swiki Yes Yes, block IPYes 
Addresses, 
words, UserIDs 
Sycamore Yes Yes, block IPYes 
Addresses, 
UserIDs, hidden 
email addresses, 
CAPTCHA 
ThoughtFar Yes N/A Yes 
mer 
TigerWiki No No, to be coded Yes, 
password 
protection 
TikiWiki Yes Yes Yes 
TWiki Yes Yes, IP blacklist, Yes 
content by 
regexp, excessive 
activities, 
scripted 
registration 
prevention 
(plugin), hidden 
e-mail addresses 
UseModWiki Yes? ? Yes? 
WackoWiki Yes Yes, referrer Yes 
blacklist 
Wiclear Yes Yes, blacklist Yes, 
captcha available 
Yes 


Inline 
HTML"4! 


Yes 


No 


Yes 


Yes? 


Yes 


Yes Yes 

User- Document 
customiza renaming 
ble 


interface!5 
] 


For Yes, updating 
AniAniWe all backlinks 
bs, CSS 

user CSS,Yes, old page 
themes redirects 

Yes Yes 

Partial -No 

style- 

sheets, 

template 

themes, Yes 

user CSS, 

modules 

templates, Yes, fixing 
skins, userbacklinks 
CSS 

CSS ? 

- themes, Yes 


configurable/s style-sheets 


afe 
Yes 


themes, Yes 
style-sheets 
for plugin 
too 
Wikiwig 
WikkaWiki Yes Yes, referrer Yes Yes -style-sheets No 
blacklist configurable/s 
(module), HTTP afe 
REQUEST 
analysis 
(module) 
WikkiTikkiT Yes Yes, Captcha No Yes -style-sheets No 
avi configurable/s 
afe 
XWiki Yes Yes Yes style- Yes 
sheets, 
templates, 
themes 
ZWiki Yes Yes Yes Yes style- Yes 
sheets, 
templates, 
skins 
File Spam Page Inline User- Document 
uploading prevention access HTML"! — customiza renaming 
/ control!!6 ble 
attachme ] interface!!® 
nts ] 
Features 2 
WYSIWYG Web feeds Extensibility Other Notable 
page editing features 
BrainKeeper Yes Yes, RSS Enterprise-level 
Search, Interactive 
Dashboard, Tagging, 
Content Workflow, 
Custom Data Fields, 
Simple but 
Comprehensive 
Administration 
Central Yes Yes, RSS Block architecture, 
Desktop Calendars, 
Discussions, 
Database feature, 
Tasks and Milestones 
Blocks available 
Clearspace 
Confluence Yes Yes, RSS plugins written Enterprise level 
in Java security, Simple 
admin interface, 
Open plugin API, 
Tagging, Fully 
supported 
Corendal Wiki Yes No patch For Corporate 


mechanism environments. 
Articles _ hierarchy. 


Integrated with 
Active Directory 
users and groups, 
NTLM 
authentication, 
Workflow, Revisions, 
Subscriptions, 


WYSIWYG _ editor, 
Access Control (Read 


and Write) 
coWiki ? Yes, RSS[19] plugins via anData saved hierarchic 
OOP interface (web trees) and in 
XML, per-page 
threaded comments, 
comfortable 
installation wizard 
DekiWiki Yes Yes, RSS, XML, JSONAPI,  ServiceHighly usable, stores 
Oriented in XML, SOA, API, 
Architecture enterprise arch and 
business focused, 


fully supported by 
MindTouch company 
website 

DidiWiki No No extremely fast and 
requires very little 
RAM, built-in web 
server 

DokuWiki YesQuickbuttons Yes, RSS/Atom[20] custom regexps Section Editing, 
XHTML-Compliant, 
nice tables, side-by- 


side diff, 
namespaces, 
Interwiki 

EditMe Yes Yes, RSS javascript Multi-level access 
controls, hosted 


solution, side-by-side 
FlexWiki 


Instiki 


JotSpot 


JSPWiki 


Kerika 


MediaWiki 


Midgard Wiki 


MoinMoin 
No Yes, RSS 

No 

WYSIWYG Web feeds 

page editing 

Yes Yes, RSS, per page 


Partial, alpha Yes, RSS, per page 
Yes, 
graphical 
features 


fullNo 
Wiki 


Partial, 
MediaWiki 
experimental 
version 
EnotifWiki + 
FCKeditor 


special Yes, RSS/Atom 
1.5.3 


No Yes, RSS, all changes 


Yes, v1.5+ 




Yes, RSS, last changes 


diff, 
notifications 
Forms, Scripting 


Email 


ASP, 
WikiTalk, .Net 
Reflection 
plugins 
fairly __ trivialTrivial to set up on 
with minimalany platform _ that 
Ruby supports Ruby. Can 
knowledge use different 
(administrator-set) 
markup _ languages. 
Supports inline 
HTML, Textile, 
Markdown, and 
RDoc. 
Other 
features 
plugins, server-Support for  server- 
side Javascript side Javascript, 
VMWare virtual 
appliance 
plugins, filters, Weblog integration 
providers 
not yet 


Extensibility Notable 


Graphical wiki with 
hybrid _ peer-to-peer 
networking, can send 
project updates by 
email to non-users, 
can create private 
networks 

Per-article discussion 
page, watchlist, 
searching, email 
notification (built in, 
but currently disabled 
on Wikipedia) 

CMS integration 


actions, 
handlers 


PHP 
component 
architecture 
different plugin Enterprise 
types security, 
authentication, 
authorization, 
different 


level 


and 


parsers, 
Email notification, 
additional desktop 
edition, XML-RPC 
content 
synchronization, 
searching in 
attachments (v1.6+). 
Near-Time Yes Yes, RSS in/out API Enterprise level 
security, Simple 
admin interface, 
Categorizing and 
Tagging, Analytics, 
Premium — Business 


Engine, Fully 
supported 
OpenWiki No 
OpenLink Wiki Yes Yes, RSS/Atom/RDF _ plugins, Based on WebDAV 
Virtuoso can be maintained by 
hosted PHPany WebDAV user 
integration agent, LDAP 
Integration, 
OpenLink Data 


Space Integration, 
Publish wiki in other 
formats (Docbook 
and HTML), Wiki 
Cluster and Tag 
support, WYSIWIG 


Editor, Revisions, 
Access Control List 
PAUX Yes Yes, RSS/Atom/RDF actions Individualized 
publication of 
dynamic content, 
which contains 
reusable semantic 


content objects. They 
are able to represent 
knowledge in its full 
complexity, and they 
make knowledge 
available as filterable 
content for Websites, 


semantic Wiki, 
detailed-evaluated 
eLearning and 
Perspective 


PhpWiki 


PmWiki 


Socialtext 


Swiki 


Sycamore 
Yes, IE, MozillaYes, RSS (last day's HTML 


& Firefox 


Template:Modul Yes 


e 


No 


Yes 


WYSIWYG 
page editing 
No 


ThoughtFarme Yes 


r 


changes) 


RSS/Atom/RDF: 
global, per page or per 
user 


Yes 

Yes, RSS feeds, 
Google/Technorati 
search results 

Web feeds 


Yes, RSS each page, 
bookmarks RSS 


Yes 
individualized print 


media. 
Searching over MS 
Includes, Office documents, 


XSLT includescan search files in 

that can takefolders outside of the 

external URLWiki (including MS 

contents, .NetOffice files), Active 

action classes Directory integration, 
User authorisation 
based on AD Group 
membership, Private 
Content 


plugins support all databases 


very, 100+Highly sophisticated 

plugins support for 
customization. 
Practically cameleon- 
like 

using accessible —_ through 

REST/SOAP  REST/SOAP APIs 

APIs 

Extensibility Other Notable 
features 

Some Installs own 
webserver 


(Commanche) and 
can co-exist with IIS 


or Apache. 
AniAniWeb function 
provides owner 


control of granular 
access and creates 
access groups. Users 


tracked by email 
address. 
Javascript based 
quick edit, web based 
configuration, 
searching, built-in 
map function 
Plugins, APIStructured wiki, 
used by granular access 


Tiger Wiki 


TikiWiki 


TWiki 
No 


Partial, 
visual 
insertion 


Yes, pre-installed Yes, RSS/Atom, with250+ plugins, Revision 


plugin 


just 
quicktag 


Yes, RSS with plugin 


Yes, RSS/Atom/RDF 


search string 
ThoughtFarme controls, faceted 

r Professionalbrowsing, revision 

Services control, full text 
search, email 
notification, photo 
albums 

Code All minimal features 

modification — that a wiki need 

60+ plugins &ACLs, calendaring, 

mods charting, email 
notification, form 
handling and 
reporting, slideshow 
presentations, 
spreadsheet 
calculations and 
GRAPHS, mobile, 
GIS (MapServer), 
JGraphPad for 
drawings inside 
wikipages, 3d 
Browsing, 
Workspaces, 
Workflows, 
OpenOffice 
WYSIWYG editing 


of Tikiwiki content, 
Multitiki installations 


control, 
50+ addons,ACLs_ for _ topics, 
extendable tagging, _ blogging, 
through TWikicalendaring, charting, 
Plugin API global search & 
replace, email 
notification, form 
handling and 
reporting, platform to 
build wiki 
applications, 
graphing, slideshow 
presentations, 13 
translations, plotting, 
multistyle diffs, 
advanced searching, 
spreadsheet 
calculations, 
WYSIWYG supports 
TML, available as a 
VMware appliance 
UseModWiki No? Yes 2 
WackoWiki Yes, WikiEdit actions, multilingual, 
handlers WYSIWYG-like 
editor, ACLs, 
subpages, comments, 
files, email 
notification, cloning 
of pages, installation 
wizard... 

Wiclear No Yes, RSS/Atom plugins hierarchical _ pages, 
linked _ translations, 
multilingual, ACLs, 
section editing, 
XHTML compliant, 
side-by-side diff, 
email notifications, 
admin panel, 
moderator/admin/use 
r 

Wikiwig 

WikkaWiki Yes, WikiEdit Yes, Singleactions, FreeMind support, 

page/Comments/Globa handlers ACLs, cloning of 

] pages, advanced 
syntax highlighting 
using GeSHi, on-the- 
fly downloading of 
codeblocks 

WikkiTikkiTav No bunch of 

i custom stuff in 

php 
XWiki Yes, using Yes, RSS plugins, Platform to build 
TinyMCE macros, scripts, wiki applications, 
applications Forms and Scripting, 
Multilingual, 
Database storage, 
Fulltext search, 
GraphViz, SVG, 
Freemind, Lucene, 
Charting, photo 
albums, 
presentations, 
blogging, calendar, e-mail plugin, virtual wikis, simple table computations, 
sortable tables, section editing, portlet integration, integrated statistics, 
XMLRPC API, Tags, PDF/RTF export 
ZWiki Yes, using Epoz Yes, RSS/Atom all Zope plugins, LaTeX, and more hierarchical pages, 
revision control, fulltext search, email notification, issue tracker, LaTeX integration 


Installation 


Wiki software | Runs on Windows | Ease of installation | Web-server required | Other software required
BrainKeeper | Yes | Hosted, no installation or setup required | None | None
Central Desktop | Yes | Hosted, no setup required | None | None
Clearspace | | | |
Confluence | Yes | Easy, Configuration Wizard | Tomcat included, or use your own servlet container | Java 1.4
Corendal Wiki | Yes | Moderately simple | Tomcat, Oracle Application Server | Java, MySQL or Oracle, Tomcat or Oracle Application Server
coWiki | Yes | | Apache with PHP | MySQL, PHP
DekiWiki | Yes | Hosted, Moderately simple, Appliance | Apache | Mono, MySQL, PHP
DidiWiki | Yes, with Cygwin | Simple | None (built-in) | None
DokuWiki | Yes | Moderately simple | Should work on | PHP
EditMe | Yes | None required | None required | None
FlexWiki | Yes | Moderately simple | IIS/Apache | ASP.NET/Mono
Instiki | Yes | Trivial | None (built-in) | Ruby
JotSpot | Yes | Easy, VMware virtual appliance | None (built-in) | VMware Player
JSPWiki | Yes | Moderately simple | Any servlet 2.3-compliant web server | Java 1.4
Kerika | Yes | Very easy (under 2 minutes) | None | Java 1.5
MediaWiki | Yes (Windows, Macs, Linux) | Moderately simple | Apache or IIS with PHP 5.0+ | MySQL, PHP5
Midgard Wiki | No | Easy, requires root | Apache with PHP | MySQL, PHP
MoinMoin | Yes | Easy for Desktop version | None for Desktop version | Python
Near-Time | Yes | Hosted: no set-up or installation required | None | None
OpenWiki | Yes | Easy | IIS | ASP
OpenLink Wiki | Yes | Easy | OpenLink Virtuoso | OpenLink Virtuoso
PAUX | Yes | Easy | None (built-in) | Java
Perspective | Yes | Moderately simple | IIS | .NET Runtime 1.1
PhpWiki | Yes | Moderately simple | Any Web-server with PHP | PHP
PmWiki | Yes, with Apache+PHP5 or IIS+PHP4, or standalone | Very easy | Any Web-server with PHP, can run without a web server | PHP
Socialtext | Yes | Hosted service without installation as an option; appliance option available with managed service for remote upgrades and administration | No, all required components included | No, all required components included
Swiki | Yes | Very easy: just drag the image file over the executable or (non-GUI) just use the image file as an executable parameter | None, installs own server. Can coexist with IIS and Apache by running on alternate port | None
Sycamore | No | Easy | Apache | Python (with Image Library), MySQL/PostgreSQL, Xapian
ThoughtFarmer | Yes | Managed installation | Windows Server 2003 | SQL Server 2000/2005
TigerWiki | No | Easy | Any Web-server with PHP | PHP
TikiWiki | Yes | Moderately simple via install script. Telnet/ssh is useful but not necessary | Any Web-server with PHP | PHP
TWiki | Yes | Native install: easy on *NIX, moderately difficult on Windows. Point-and-click install on Windows with VMware appliance and TWiki for Windows Personal | Any Web-server with CGI support. Web-server included in VMware appliance and TWiki for Windows Personal | Perl, RCS
UseModWiki | Yes | Easy? | ? | Perl
WackoWiki | Yes | Quick and easy, multilingual installer | Apache v1.3.x (v2 recommended), IIS | MySQL, PHP
Wiclear | ? | Easy, in 4 steps, update wizard | Apache with PHP. IIS untested | MySQL, PHP
Wikiwig | | | |
WikkaWiki | Yes | Easy installation/upgrade, through web-wizard | Any Web-server with PHP | MySQL/PHP
WikkiTikkiTavi | ? | Moderate | Any Web-server with PHP | MySQL/PHP
XWiki | Yes | Simple package for Windows available | Tomcat or Jetty | Java
ZWiki | Yes | Moderate (easy if Zope is already installed) | Zope | Python (included with most Zope installs)

• Page access control: Some wiki engines allow (optional) read/write access 
restriction to users or user groups on a per-page basis (e.g. through access 
control lists). 

• Inline HTML: "Safe" means that several features of HTML are restricted. This is 
better than "Full" for security reasons. Users with complete access to HTML 
could, for example, create spoof forms to trick users. 

• User-customizable interface: Many items have administrator hand-editable 
templates. Even items which use hard-coded templates could still be modified if 
the source is available. 

• Licenses here are a summary, and are not intended to be complete statements of 
the licenses. Some packages may use libraries under different licenses. 

• "Ease of installation" is expressed relative to other server software packages, 
and not to desktop applications. 

• MediaWiki customizable interface: templates. 

• Instiki, Ruby License: Instiki is distributed under the same terms as Ruby 
itself, per the Ruby license. 

• "Evidence that demonstrates that a business model or idea is feasible." 

• URL anding are available using a MediaWiki extension. 
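
The "Safe" inline HTML mode described in the notes above can be illustrated with a minimal Python sketch. It is not taken from any of the wiki engines listed; the allowlist and the names ALLOWED_TAGS, SafeHTMLFilter and sanitize are assumptions made for the example. Allowlisted tags are kept with their attributes stripped, and everything else (including form tags) is escaped so that it is displayed as text rather than interpreted.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; a real wiki engine would choose its own set.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "ul", "ol", "li"}


class SafeHTMLFilter(HTMLParser):
    """Keeps allowlisted tags (without attributes) and escapes everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes such as onclick are dropped on purpose.
            self.parts.append(f"<{tag}>")
        else:
            # Show the disallowed tag as literal text.
            self.parts.append(escape(self.get_starttag_text()))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")
        else:
            self.parts.append(escape(f"</{tag}>"))

    def handle_data(self, data):
        self.parts.append(escape(data))


def sanitize(markup):
    """Return markup with only the allowlisted inline HTML left active."""
    parser = SafeHTMLFilter()
    parser.feed(markup)
    parser.close()  # flushes any buffered text
    return "".join(parser.parts)
```

Calling sanitize() on user-submitted page text before rendering would leave simple emphasis and list markup working while neutralising forms and scripts, which is the trade-off the "Safe" setting describes.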


Main article: Wiki software 


List of wiki software 
List of wikis 


WikiServer 


Developer: Eddie Edwards (Original), Ryan Norton (Current) 
Latest release: 1.5.11c / May 18, 2005 
OS: Cross-platform 
Genre: WikiEngine, HTTP Server 
License: Public Domain 
Website: WikiServer Homepage 


WikiServer is a WikiEngine that is completely self-contained: it includes its own HTTP 
server, and so does not require CGI support, Perl, or even a separate Web server such as Apache 
or IIS. As such, it is one of the easiest ways to install and run a wiki; even people without any 
knowledge of the system can set up a WikiServer without many problems. The name WikiServer is also 
sometimes used as a general term for any WikiEngine with a built-in server. 
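
To make the idea of a wiki engine with a built-in HTTP server concrete, here is a minimal Python sketch using only the standard library. It is an illustration of the concept, not WikiServer's actual code (which is written in C++); the names PAGES, render_page, WikiHandler and serve, and the port number, are assumptions chosen for the example.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# In-memory page store; a real engine would persist pages to disk.
PAGES = {"FrontPage": "Welcome to this wiki."}


def render_page(name):
    """Return (HTTP status, HTML body) for a page name."""
    text = PAGES.get(name)
    if text is None:
        return 404, f"<h1>{escape(name)}</h1><p>This page does not exist yet.</p>"
    return 200, f"<h1>{escape(name)}</h1><p>{escape(text)}</p>"


class WikiHandler(BaseHTTPRequestHandler):
    """Serves wiki pages directly, with no separate web server or CGI layer."""

    def do_GET(self):
        name = unquote(self.path.lstrip("/")) or "FrontPage"
        status, body = render_page(name)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the wiki until interrupted; the port is an arbitrary example value."""
    HTTPServer(("127.0.0.1", port), WikiHandler).serve_forever()
```

Calling serve() and opening http://127.0.0.1:8080/FrontPage in a browser would show the front page; the whole wiki is one process, which is why this style of engine needs no Apache, IIS or CGI setup.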


History 
Eddie's WikiServer 


The first WikiServer was developed by Eddie Edwards around 1997. It was continuously 
improved for a few years until development was halted for unknown reasons. 

Eddie's WikiServer borrowed most of its Wiki Syntax from Ward's Wiki, but it had a few of 
its own extensions: 

• 1: made a table, and increasing 1 made each consecutive row. You could do the 
same with 1;, only this made a table without a border 

Eddie's WikiServer (as it is called on WikiWikiWeb) was Windows-only and needed to be 
compiled using Microsoft Visual C++. It was a command line application that had its server 
settings configured with command line arguments. 


Further WikiServer development 


Eddie ceased development at version 1.1 (beta). Other people have continued the 
development: 
1. Michel Van De Wetering had finished a set of mods on EW 1.1, though still beta. 
This mod was long considered to be the latest version of the wikiserver: 
Wikiserver 1.2 (beta), but the development has been taken up by Ryan Norton 
and has now reached version 1.6 status (see below). 
2. Elrey Ronald V. modified Eddie's Wikiserver 1.0 for his simple group 
collaboration. In addition to 1.0 features, it had several new features. 


WxWikiServer 


Several years later, around late 2003, development was started again by a developer named 
Ryan Norton. Because he was a developer for the WxWidgets project, he decided it would be a 
good idea to capitalize on his WxWidgets experience and build WikiServer with it. He named 
his fork WxWikiServer. 
WxWikiServer differed from Eddie's WikiServer in several major ways: 
• It was no longer Windows-only: it could run on all major platforms 
• It was no longer command-line driven: initial configuration was done through 
the WikiServer.ini file 
• It had a TaskBarIcon, from which you could shut it down, restart it and see the 
server log, all without having to open a web browser 
Eventually many features were added to WxWikiServer, such as full UseMod syntax 
compatibility and improved security. 
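
The move from command-line arguments to an INI file can be sketched in Python as follows. The section and key names (server, port, wiki_root, log_file) are hypothetical, since the text does not list the actual contents of WikiServer.ini; the sketch only shows the general pattern of reading settings from an INI file and falling back to defaults.

```python
import configparser

# Hypothetical file contents; the real WikiServer.ini keys are not given in the text.
SAMPLE_INI = """
[server]
port = 8080
wiki_root = ./pages
"""

# Assumed defaults, used for any key missing from the file.
DEFAULTS = {"port": "80", "wiki_root": ".", "log_file": "server.log"}


def load_settings(ini_text):
    """Parse INI text and return a settings dict with defaults filled in."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["server"] if parser.has_section("server") else {}
    settings = {key: section.get(key, default) for key, default in DEFAULTS.items()}
    settings["port"] = int(settings["port"])  # ports are numeric
    return settings
```

A settings file like this lets the server be started from a desktop icon without arguments, which fits the taskbar-driven design described above.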


Current WikiServer 


Until the 1.5.3 version, WxWikiServer was not as popular as the original WikiServer because 
it lacked a few features and was buggy, at least when used extensively on the Internet. Therefore 
some people decided to stay with the original WikiServer, or the 1.2 version. 

However, with the help of several determined users, WxWikiServer was eventually restored 
to the stability of the original WikiServer. Later, when it became clear that Eddie's WikiServer 
was no longer going to be developed, Ryan Norton renamed his WxWikiServer back to 
WikiServer to avoid confusion. His latest version is 1.6RC18 (beta). 

To this day, some people use the terms WxWikiServer and WikiServer interchangeably. 


See also 


• WikiSoftware 


MediaWiki 


The Main Page of the English Wikipedia. 
Maintainer: Wikimedia Foundation 
Stable release: 1.10.1 (July 13, 2007) 
Preview release: - (-) 
OS: Cross-platform 
Use: Wiki 
License: GNU General Public License 
Website: http://www.mediawiki.org/ 


MediaWiki is a web-based wiki software application used by all projects of the Wikimedia 
Foundation, all wikis hosted by Wikia, and many other wikis, including some of the largest and 
most popular ones.[1] Originally developed to serve the needs of the free content Wikipedia 
encyclopedia, today it has also been deployed by companies as an internal knowledge 
management solution, and as a content management system. Notably, Novell uses it to operate 
several of its high-traffic websites.[2] 

MediaWiki is written in the PHP programming language, and can use either the MySQL or 
PostgreSQL relational database management system. Distributed under the terms of the GNU 
General Public License, MediaWiki is free software. 


History 


The current software was originally written for Wikipedia by Lee Daniel Crocker, based on 
the user interface design of German University of Cologne student and developer Magnus 
Manske. Wikipedia had originally used a small wiki engine called UseModWiki written in Perl. 
Wikipedia was later switched to Manske's PHP-based software to offer more functionality. 
Increasing usage caused load problems, leading Crocker to re-write the software with a more 
scalable MySQL database backend. Later, Brion Vibber would take up the role of release 
manager and most active developer.[3] 

Since the release of the first version of Manske's script, the software had been given multiple 
nicknames representing the state of development—"the PHP script", "phase II", "phase III", "the 
new codebase"—but no product name. After the Wikimedia Foundation was announced on June 
20, 2003, the name "MediaWiki" was coined by Wikipedia contributor Daniel Mayer as a play 
on "Wikimedia," [4] and the name was gradually phased in beginning in August 2003. The name 
has frequently caused confusion due to its intentional similarity to the "Wikimedia" name (which 
itself is similar to "Wikipedia"). Nevertheless, "MediaWiki" has become a recognizable brand, 
with a Google search yielding over 178 million results on the name in July 2007. 

The product logo was created by Erik Moeller using a flower photograph taken by Florence 
Nibart-Devouard, and was originally submitted to an international logo contest for a new 
Wikipedia logo held in summer 2003.[5] The logo came in third place, and was chosen to 
represent MediaWiki instead of Wikipedia, with the second place logo used for the Wikimedia 
Foundation and the first place logo for Wikipedia itself.[6] The double square brackets around 
the photo of a sunflower symbolize the syntax MediaWiki uses for creating hyperlinks to other 
wiki pages. 


Release history 


Version   Date                Notable changes 

1.1       December 8, 2003    • New wiki table syntax. 
                              • User-editable interface messages through "MediaWiki 
                                namespace". 
                              • XML-wrapped page source export with optional history. 
                              • "Magic words" - special variables and parser 
                                instructions. 

1.2       March 24, 2004      • Experimental web-based installer. 
                              • Image resizing and thumbnail generation. 
                              • Editing toolbar for learning wiki syntax. 
                              • User rights management within the wiki. 

1.3       August 11, 2004     • New, highly CSS-based default look and feel 
                                ("MonoBook" skin) and better web standards 
                                compliance. 
                              • Parametrized templates. 
                              • Category feature. 
                              • Automatic merging of edit conflicts when possible. 
                              • Improved installation. 

1.4       March 20, 2005      • User interface language can be changed by the user. 
                              • Significant performance improvements. 
                              • Support for compressing old revisions of articles to 
                                reduce storage needs. 
                              • Image gallery generation, list of recently uploaded 
                                images. 
                              • SVG rasterization support (requires external support 
                                tools). 

1.5       October 5, 2005     • Major database redesign decoupling text storage from 
                                revision tracking, resulting in: 
                                ◦ Significant performance boosts for some operations. 
                                ◦ Permalink functionality for all revisions. 
                                ◦ Support for storing bulk data outside the database. 
                              • Support for e-mail notification upon changes. 
                              • Page content must be encoded in UTF-8. 

1.6       April 5, 2006       • The account creation form has been separated from the 
                                user login form. 
                              • Page protection/unprotection uses a new, expanded 
                                form. 
                              • "Job queue" for background updates. 
                              • Improved tracking of template usage. 
                              • Tracking of external link usage for more systematic 
                                anti-spam measures. 
                              • Template parameters can have default values. 

1.7       July 7, 2006        • MediaWiki 1.7 requires PHP 5 (5.1 recommended). PHP 
                                4 is no longer supported. 
                              • Deleted files can now be restored. 

1.8       October 10, 2006    • Full support for PostgreSQL (8.1 or better) database 
                                backend. 
                              • Support for DjVu thumbnailing and multipage navigation. 
                              • Various improvements to user blocking; blocks can be 
                                placed only on unregistered users using a particular IP 
                                address. 
                              • Uploading files from publicly accessible URLs is possible 
                                if enabled. 

1.9       January 10, 2007    • "Undo revision" feature. 
                              • Various improvements to blocking and special page 
                                caching. 
                              • Tables with sortable columns. 
                              • Addition of an edit counter field to the user database. 
                              • Revision size displayed on watchlists and recent changes. 

1.10      May 9, 2007         • "Cascading protection" feature. 
                              • Improved tooltips and accesskey feature. 
                              • Various improvements to blocking and special page 
                                caching. 
                              • IPv6 support. 


Key features 


MediaWiki provides a rich core feature set and a mechanism to attach extensions to provide 
additional functionality. Due to the strong emphasis on multilinguality in the Wikimedia 
projects, internationalization has received significant attention by developers. The user interface 
has been fully or partially translated into more than 70 languages, and can be further customized 
by site administrators (the entire interface is editable through the wiki). Because Wikipedia is 
one of the world's largest websites, achieving scalability through multiple layers of caching and 
database replication has also been a major concern for developers. Wikipedia and other 
Wikimedia projects continue to define a large part of the requirement set for MediaWiki. 

One of the earliest differences between MediaWiki (and its predecessor, UseModWiki) and 
other wiki engines was the use of "free links" instead of CamelCase. Where, in a typical wiki, 
text like "WorldWideWeb" would have to be typed to create a link to a page about the World 
Wide Web, links in MediaWiki are created by surrounding words with double square brackets, 
and any spaces between them are left intact, e.g. [[World Wide Web]]. This change was logical 
for the purpose of creating an encyclopedia, where accuracy in titles is very important. 
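
The difference between the two linking styles can be illustrated with a short sketch. The regular expressions and function names below are assumptions made for this example; they are a simplification and not the patterns used by MediaWiki's own parser.

```python
import re

# Free links: a page title between double square brackets, optionally
# followed by "|label". Simplified for illustration.
FREE_LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")

# CamelCase links: two or more capitalized word parts run together,
# e.g. "WorldWideWeb".
CAMEL_CASE = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")


def free_link_targets(text):
    """Return the page titles referenced by [[...]] links, spaces intact."""
    return [m.group(1).strip() for m in FREE_LINK.finditer(text)]


def camel_case_targets(text):
    """Return the words a CamelCase-based engine would turn into links."""
    return CAMEL_CASE.findall(text)
```

With the free-link style, the title "World Wide Web" keeps its spaces, whereas a CamelCase engine can only link to the run-together form.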
To make editing long pages such as comprehensive Wikipedia articles easier, MediaWiki 
supports editing only a small subsection of a page (as identified by its header). 


Rich content 


MediaWiki supports rich content generated through specialized syntax. For example, the 
software comes with support for rendering mathematical formulas using LaTeX and a special 
parser written in OCaml. Similar functionality for other content, ranging from graphical 
timelines and mathematical plotting to musical scores and Egyptian hieroglyphs, is available in 
the form of extensions. 

As the name MediaWiki suggests, the software has become ever more powerful at dealing 
with a wide variety of uploaded media files. Its richest functionality is in the area of images, 
where image galleries and thumbnails can be generated with relative ease if the software is set up 
correctly. There is also support for Exif metadata. The use of MediaWiki to operate the 
Wikimedia Commons, one of the largest free content media archives, has driven the need for 
further functionality in this area. 

MediaWiki currently provides no native WYSIWYG support, though it does come with a 
graphical toolbar for simplifying the process of learning the wiki syntax. It also has a simple 
interface to allow the transparent use of external editors for uploaded files and wiki pages. 


Organization 


MediaWiki provides many features beyond hyperlinks for structuring content. One of the 
earliest features is namespaces. One problem for Wikipedia had long been the separation of 
encyclopedic content from discussions surrounding it, as well as personal pages about 
encyclopedia editors. Namespaces are prefixes before a page title (like "User:" or "Talk:") which 
allow pages with the same title to coexist, each serving a different purpose depending on its 
prefix. For instance, a page "[[The Terminator]]" could describe the 1984 movie starring Arnold 
Schwarzenegger, while a page "[[User:The Terminator]]" could be a profile describing a user 
who chooses this name as a pseudonym. More commonly, each page has an associated "Talk:" 
page which can be used to discuss its contents. 

Namespaces can be viewed as folders which separate different basic types of information or 
functionality. While new namespaces can be added, the number of namespaces in a wiki is 
typically relatively low. 
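
As a rough illustration of how a prefix separates a namespace from a page name, consider the sketch below. The set of prefixes and the function name are hypothetical; a real wiki defines its own namespace list, and a colon that does not follow a known prefix is treated here as part of the title.

```python
# Hypothetical namespace prefixes for this sketch only.
KNOWN_NAMESPACES = {"User", "Talk", "Template", "MediaWiki"}


def split_title(full_title):
    """Split 'Prefix:Name' into (namespace, name).

    Titles without a recognized prefix belong to the main namespace,
    represented here by an empty string.
    """
    prefix, sep, rest = full_title.partition(":")
    if sep and prefix in KNOWN_NAMESPACES and rest:
        return prefix, rest
    return "", full_title
```

Under these assumptions, "The Terminator" and "User:The Terminator" resolve to the same page name in two different namespaces.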

In addition to namespaces, pages can be structured using subpages. This simple feature 
provides automatic backlinks from a page of the pattern [[Page title/Subpage title]] to the 
component before the slash (in this case, "Page title"). 
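
The backlink target of a subpage can be derived from its title alone. The following minimal sketch assumes that nested subpages link back to every ancestor, which goes slightly beyond the single-level case described above; the function name is chosen for this example.

```python
def parent_titles(title):
    """Return the parent pages of a subpage title, nearest parent first."""
    parts = title.split("/")
    # Drop one trailing component at a time: "A/B/C" -> "A/B", then "A".
    return ["/".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]
```

A title without a slash has no parents, so no backlink would be generated for it.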

MediaWiki supports user-created categories. These are similar to tags used in many web 
applications, but hierarchical and descriptive. In large wikis like Wikipedia, very complex 
hierarchies have grown using this system without any central planning. 


Customisation 
Users can configure custom JavaScript that is executed on every pageview. This has led to 
JavaScript tools that users can "install"; the "navigation popup" tool, for example, displays a 
small preview of an article when hovering over a link title. 

If the feature is enabled, users can customize their stylesheets and configure client-side 
JavaScript to be executed with every pageview. On Wikipedia, this has led to a large number of 
additional tools and helpers developed through the wiki and shared among users. For instance, 
Lupin's navigation popups is a custom JavaScript tool that shows previews of articles when the 
user hovers over links, and also provides shortcuts for common maintenance tasks. Another 
example is wikEd, a full-featured MediaWiki-integrated text editor that provides syntax 
highlighting and search and replace functions. 

The entire MediaWiki user interface can be edited through the wiki itself by users with the 
necessary permissions (typically so-called "administrators"). This is done through a special 
namespace with the prefix "MediaWiki:", where each page title identifies a particular user 
interface message. The "MediaWiki:" namespace was also originally used for creating custom 
text blocks that could then be dynamically loaded into other pages using a special syntax. This 
content was later moved into its own namespace, "Template:". 

Templates are text blocks which can be dynamically loaded inside another page whenever 
that page is requested. The template "tag" is simply a special link in double curly brackets (for 
example "{{disputed}}") which calls the template (in this case located at Template:Disputed) to 
load where the tag is. Templates support parameters, so that parts of the text can be substituted 
for each specific use case. A related method, called template substitution (called by adding subst: 
at the beginning of a template tag) inserts (like a copy and paste operation) the contents of the 
template into the target page, instead of loading the template contents dynamically whenever the 
page is loaded. This limits the consistency of using templates, but may be useful in certain cases, 
and is (perhaps) less demanding on the server. 
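
The contrast between dynamic loading and substitution can be sketched as follows. This is an illustrative model only: the regular expressions, the dictionary standing in for the "Template:" namespace, and the function names are all assumptions, and real template syntax has many more cases than this handles.

```python
import re

# Matches {{name|key=value|...}} and {{subst:name|...}}; a simplification.
TEMPLATE_CALL = re.compile(r"\{\{(subst:)?([^{}|]+)((?:\|[^{}|]*)*)\}\}")
# Matches a {{{parameter}}} placeholder inside a template body.
PARAMETER = re.compile(r"\{\{\{(\w+)\}\}\}")


def _render_call(match, templates):
    """Return the template text for one call, with parameters filled in."""
    name = match.group(2).strip()
    name = name[:1].upper() + name[1:]  # {{disputed}} -> "Disputed"
    body = templates.get(name)
    if body is None:
        return match.group(0)  # unknown template: leave the tag untouched
    pairs = [p.split("=", 1) for p in match.group(3).split("|")[1:] if "=" in p]
    params = dict(pairs)
    return PARAMETER.sub(lambda m: params.get(m.group(1), m.group(0)), body)


def transclude(text, templates):
    """Expand every template call each time the page is viewed."""
    return TEMPLATE_CALL.sub(lambda m: _render_call(m, templates), text)


def substitute_on_save(text, templates):
    """Expand only {{subst:...}} calls once, like a copy and paste."""
    def replace(m):
        return _render_call(m, templates) if m.group(1) else m.group(0)
    return TEMPLATE_CALL.sub(replace, text)
```

In this model, a later edit to the template changes every page that loads it dynamically, while text inserted by substitution stays as it was when the page was saved.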

Templates have found many different uses, such as: 

• Identifying problems with a Wikipedia article by putting a template in the 
article. This template will then output a graphical box stating that the article is 
disputed, and also categorize it so that articles of this nature can be located. 

• Creating complex table layouts which are used consistently across multiple 
pages, and where only the content of the tables gets inserted using template 
parameters. 

• Sending users standard messages when they are blocked from editing, when 
their behavior is considered inappropriate, and so on. 

The MediaWiki user interface is available in different languages. A language for the wiki 
content itself can also be set, but multilingual wiki content is not supported, i.e. one cannot 
choose the ISO 639 code mul for multilingual content. 

The MediaWiki codebase contains various "hooks" where additional code can be attached. 
This allows programmers to write extensions without modifying the core or having to submit 
their code for review. Installing an extension typically consists of adding a line to the 
configuration file, though in some cases additional changes such as database updates are 
required. 
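
The general idea of a hook mechanism can be sketched in a few lines. MediaWiki itself is written in PHP; the Python below is a language-neutral illustration, and the registry, the hook name "before_save" and the function names are hypothetical rather than part of MediaWiki's actual interface.

```python
from collections import defaultdict

# Hypothetical registry: hook name -> list of handler functions.
_hooks = defaultdict(list)


def register_hook(name, handler):
    """Attach an extension's handler to a named hook point."""
    _hooks[name].append(handler)


def run_hooks(name, value):
    """Called by the core at a hook point; passes the value through each handler."""
    for handler in _hooks[name]:
        value = handler(value)
    return value


def save_page(text):
    """Core code path: it knows the hook name, not the extensions."""
    return run_hooks("before_save", text)
```

The core only names the hook point, so an extension can change behavior by registering a handler, without any edit to the core functions.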

Many of the available extensions are simple scripts to allow embedding content such as 
Adobe Flash files or HTML forms. Others add complex new behavior to the wiki syntax, such as 
Semantic MediaWiki which provides the ability to add structured and searchable relations and 
attributes to wiki pages (cf. semantic web). The Wikimedia Foundation operates a Subversion 
server where many extensions are hosted, and a directory of them can be found on the 
MediaWiki website. 


Access and groups 


While MediaWiki comes with a basic set of features related to restricting access and defining 
user groups, page access control does not tend to be given high priority in development. For 
instance, it is not possible to define the access permissions to pages on a per-namespace basis. 
Here, wikis like TWiki and MoinMoin provide more power by supporting advanced security 
mechanisms like Access Control Lists. 


Performance 


Because it is used to run one of the highest traffic sites on the World Wide Web, Wikipedia, 
MediaWiki performance and scalability have been highly optimized. MediaWiki supports Squid 
caches, load balanced database replication, client-side caching, memcached or table-based 
caching for frequently accessed processing or query results, a simple static file cache, feature- 
reduced operation, revision compression, and a job queue for database operations. 
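
Caching of frequently accessed processing results can be illustrated with a minimal sketch. It assumes rendered pages are keyed by title and revision number, and it uses an in-process dictionary as a stand-in for a shared cache such as memcached; the class and its methods are invented for this example.

```python
class RenderCache:
    """Cache rendered pages keyed by (title, revision id)."""

    def __init__(self, render):
        self._render = render  # function turning wikitext into output
        self._store = {}       # stand-in for a shared cache service
        self.misses = 0

    def get(self, title, revision_id, wikitext):
        key = (title, revision_id)
        if key not in self._store:
            # Cache miss: do the expensive rendering once and remember it.
            self.misses += 1
            self._store[key] = self._render(wikitext)
        return self._store[key]
```

Because the revision number is part of the key, a new edit produces a new entry rather than requiring the old one to be invalidated.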

MediaWiki may be overdimensioned for small-scale usage, where its large feature set and 
use of a database backend take a performance toll. On the other hand, the software is suitable for 
the operation of large scale wiki farms, such as the Wikimedia project and language family. 
However, MediaWiki comes with no built-in functionality to manage such installations. 


Limitations 


Aside from the aforementioned lack of WYSIWYG features, user documentation is found 
online only (there is no printed manual) and is not produced by a single vendor (dozens of 
authors have contributed to the documentation). The installation and usage of the MediaWiki 
software is not intuitive for inexperienced computer users (compared to purpose-made 
software such as FrontPage, for example). 

There is no complete MediaWiki syntax definition: the syntax is defined by the parser PHP 
code, and an Extended Backus-Naur form (EBNF) syntax definition may not be possible. This 
has caused difficulties in creating a WYSIWYG editor or complete ports of the parser to other 
languages. 

Furthermore, there is no offline version of MediaWiki, which would enable users to update 
pages on their client offline, and then have those pages automatically transferred to the server 
when re-connected, in a similar way to the client-server operation of Lotus Notes. An offline 
MediaWiki client could also help users with the back-up of important pages. 

Table operations could be improved. While the syntax for creating tables is not difficult for 
most IT-literate people to learn, it is offputting for the wider audience of subject-matter experts 
that Wikipedia would want to update its pages. Adding and removing columns from a table is 
tedious for all users. Calculations on data in table cells are also time-consuming to set up, 
compared to any spreadsheet. 


TikiWiki 
Latest release: 1.9.7 / December 6, 2006 
Available language(s): Multilingual (14) 
Genre: Content Management System (CMS) / Geospatial Content Management System (GeoCMS) / 
Groupware web application 
License: LGPL 
Website: TikiWiki.org 


Tiki CMS/Groupware, originally and more commonly known as TikiWiki, is an open 
source (LGPL) Content Management System (CMS) / Geospatial Content Management System 
(GeoCMS) / Groupware web application enabling websites and portals on the internet and on 
intranets and extranets. TikiWiki is a customizable modular multi-feature package; each 
component can be enabled / disabled and customized by the TikiWiki administrator. TikiWiki 
extends the customization to the user with selectable skins / themes. 

The project is hosted on SourceForge, and was the July 2003 Project of the Month. 


Major Components 


TikiWiki provides components in these general categories: 

• Content Creation and Management Tools: These features enable 
administrators and users to create, display, and maintain online-accessible 
content of all kinds, including text, binary data, images, links, and much more. 

• Content Organization Tools and Navigation Aids: These features enable 
administrators and users to organize, structure, and present content in a 
manageable way. 

• Communication Tools: These features enable administrators and users to 
communicate with each other (and the outside world) by means of numerous 
media, including internal messaging, voice telephony, data interchange, and 
many more. 

• Configuration Tools and Administration Tools: These features enable 
administrators (and users who have been given the appropriate permissions) to 
configure and administer all aspects of a TikiWiki site. 

In addition, TikiWiki allows each user to choose from several different pre-defined visual 
styles/themes/skins. These themes are implemented using CSS and the open source Smarty 
template engine (which is included in the TikiWiki installation). Additional themes can be 
created by the TikiWiki administrator for branding or customizing the look of their website, but 
doing so requires familiarity with CSS and Smarty. 

For more information, see the Features page at doc.tikiwiki.org. 


Content Creation and Management Tools 


• Articles - fast-breaking news, announcements 
• Blogs - online diaries or journals 
• Charts - like polls, but more feature-rich; displayed in center column 
• Comments - user comments that can be appended to articles, Wiki pages, 
forum posts, and more 
• Cookies - taglines drawn randomly from tagline database 
• Directory - user-submitted Web links 
• Dynamic Content - snippets of text or code that can be incorporated by 
reference 
• Ephemerides - content that varies by date 
• FAQs - frequently asked questions and answers 
• Featured Links - external Web pages that open in an iframe 
• File Galleries - computer files and software for downloading 
• Forums - online discussions on a variety of topics 
• HTML Pages - static and dynamic HTML content 
• Image Galleries - collections of graphic images for viewing or downloading 
• Maps - navigable, interactive maps with user-selectable layers 
• Newsletters - content mailed to registered users 
• Polls - brief list of votable options; appears in module (left or right column) 
• Quizzes - timed questionnaire with recorded scores 
• RSS Feeds - newsfeeds from external Web sites 
• Surveys - online questionnaire 
• Trackers - facts and figures storage & retrieval, useful for to-do lists, 
inventories, or bug tracking 
• Wiki - collaboratively authored documents 


Content Organization Tools and Navigation Aids 
• Calendar - show when content was created or modified 
• Categories - classify content according to subject descriptors 
• Content Templates - give a consistent look and feel to Wiki pages 
• Hotwords - automatically attach links to specified words or phrases 
• Modules - control appearance and content of boxes that appear in the left and 
right columns 
• MyTiki - provide content organization and communication tools for registered 
users 
• Search - provide full-text search capabilities 
• Structures - create hierarchically organized "breadcrumb" navigation aids for 
sets of Wiki pages 
• UserMenu - create custom menus to aid site navigation 
• Workflow - control routing of documents based on objectively defined actions 


Communication Tools 


• Chat - real-time text chatting 
• Communication Center - exchange data with other TikiWiki sites 
• Live Support - notify admin by e-mail when a user needs help 
• Mail-In - submit Wiki pages via e-mail 
• Messaging - enable users to send internal messages to each other 
• Mobile Tiki - make a TikiWiki site accessible to users of Web-enabled cell 
phones 
• Shoutbox - provide a "graffiti" box on the site's home page 
• Tikibot - respond to data queries originated via IRC 
• Voice Tiki - provide voice-based browsing capability 
• Webmail - give users Web-based access to their POP3 e-mail accounts 


Configuration Tools and Administration Tools 


• Feature-specific configuration of: articles, blogs, directory, FAQs, features, file 
and image galleries, forums, maps, polls, RSS feeds, trackers, webmail, and wiki 
• General configuration (set up, name, and configure the TikiWiki site) 
• Login configuration (how users register and log in) 
• Quota configuration for user files 
• Admin drawings - set up drawing tools for Wiki pages 
• Admin DSN - create links to external databases 
• Backups - make dumps of TikiWiki's SQL database 
• Banners - insert, track, and manage advertising banners 
• Banning - block access from individual IPs or ranges of IPs 
• Cache - control and flush cached data 
• Edit templates - edit Smarty templates 
• External Wikis - enable direct links to external Wikis 
• Groups - manage user groups 
• Import PHPWiki - import data from a PHPWiki site 
• Integrator - automatically import external HTML pages into the Wiki 
• Phpinfo - view PHP information on the server 
• QuickTags - define QuickTags for inserting Wiki syntax 
• Referrer Stats - view referrer stats 
• Search Stats - view search stats 
• Stats - view site stats 
• Theme control - assign different themes to various TikiWiki components 
• Users - manage registered users 


Internationalization 


TikiWiki is an international project, providing translations of the interface in several 
languages. The default interface language is English/en; but TikiWiki is designed to support any 
language encodable with UTF-8. As of 2005-09-29, TikiWiki is fully translated into eight 
languages and reportedly 90% or more translated into another five languages. There are also 
partial translations for nine other languages. A list of languages and current translation status is 
available on the TikiWiki website. 


Implementation 


TikiWiki is developed primarily in PHP, with some JavaScript code. It makes extensive 
use of a database: it was initially developed using MySQL but now, via the ADOdb database 
abstraction library, also supports PostgreSQL, Oracle, Sybase, and Microsoft SQL Server. 
TikiWiki will run on any server, including Apache and Microsoft's IIS, that provides PHP 4.1 (or 
later) and one of the supported databases. See the TikiWiki website for specific requirements. 

Components of TikiWiki utilize other open source projects, including ADOdb, HawHaw, 
Graphviz, TouchGraph, phpCAS, FeedCreator, htmlArea, Overlib, PHP Layers Menu, 
JGraphPad, Morcego and Mapserver. 

If Mapserver is used, TikiWiki becomes a Geospatial Content Management System with 
maps, location on maps of registered users, geographical images, geographical metadata and 
more. 


Project team 


TikiWiki is under active development by a large international community of over 300 
developers and translators. Project members have donated the resources and bandwidth required 
to host the extensive TikiWiki.org website, which exclusively utilizes TikiWiki. The project 
members refer to this dependence on their own product as "eating their own dogfood". 


History 
The initial release of TikiWiki, version 0.9 (code named "Spica"), was in October 2002. It 
was primarily the development of Luis Argerich (Buenos Aires, Argentina), Eduardo Polidor 
(Sao Paulo, Brazil), and Garland Foster (Green Bay, WI, United States). Over 300 developers 
and translators have contributed to TikiWiki since then. The project has been hosted on 
SourceForge since its inception. 


Name 


The name TikiWiki is written in CamelCase, a common Wiki syntax indicating a hyperlink 
within the Wiki. It is most likely a compound word combining two Polynesian terms, Tiki and 
Wiki, to create a self-rhyming name, which together rhymes with wikiwiki, a common variant of 
wiki. 


List of wikis 


This is an incomplete list of wikis (sites based on the wiki model) of comparative notability. 
More exhaustive lists can be found on directory sites; see the external links section below. 


Wiki communities by topic 


Culture and the arts 


• A Million Penguins - a novel that was written collaboratively. 
• IMSLP - a wiki library of public domain music scores. 
• Lostpedia - a wiki dedicated to the LOST television series. 
• LyricWiki - a wiki listing lyrics by album. 
• Memory Alpha - a Star Trek-related encyclopedia. 
• Wookieepedia - a large Star Wars wiki/encyclopedia. 
• ZineWiki - a zine encyclopedia. 


Directories 


• AboutUs.org - a wiki about Web sites. 


Geographic 


• City Wiki - a wiki used for geographically localized information. 
• Galbijim Wiki - a wiki devoted to expats in South Korea. 


Political 


• Conservapedia - an encyclopedia wiki with a Christian conservative 
viewpoint. 
• Wikocracy - a website where anyone can create hypothetical governments and 
laws or toy around with current ones. 


Religion 


• OrthodoxWiki 


Recreational 


• Uncyclopedia - a satirical parody-style encyclopedia. 
• Sensei's Library - a site dealing with the ancient oriental game of Go. 
• Wookieepedia - a website dedicated to the Star Wars universe. 


Reference 


• Baidu Baike - Chinese web encyclopedia. 
• Citizendium - an online wiki-based encyclopedia that does not allow 
anonymous editing and has an "editor" role for specialists in particular subjects. 
• Enciclopedia Libre Universal en Español - a Spanish language fork of 
Wikipedia. 
• Scholarpedia - a small wiki, with fewer than 100 articles, that is written by 
professionals in their field. For example, an article on neurons would be written 
by a famous neurosurgeon. 
• Susning.nu - Swedish language encyclopedia, dictionary, and discussion forum. 
A competitor with the Swedish Wikipedia. 
• Wikibooks - textbooks; sister project of Wikipedia. 
• Wikinews - a collaborative news service; sister project of Wikipedia. 
• Wikinfo - a fork of Wikipedia, inspired by Wikipedia, but without the neutral 
point of view policy, instead using a sympathetic point of view. 
• Wikipedia - a general encyclopedia covering a wide range of topics. 
• Wikiquote - a quote repository. A sister project of Wikipedia. 
• Wikisource - primary sources for use in Wikipedia, a sister project of 
Wikipedia. 
• Wikispecies - a directory of species. A sister project of Wikipedia. 
• WikiTimeScale - an interactive graphical illustration of history. 
• Wikiversity - an online learning program; a sister project of Wikipedia. 
• WikiZnanie - a Russian language WikiWiki encyclopedia released under the 
BSL DPL license. 
• Wiktionary - a multilingual dictionary project; a sister project of Wikipedia. 
• Wikia - (formerly Wikicities) a wiki farm hosting wikis for various online 
communities. 
• Wikitravel - comprehensive travel guide website, using the MediaWiki 
software. 
Science and technology 


• Dispersive PDE Wiki - a wiki concerned with the well-posedness of various 
non-linear dispersive and wave equations. 
• OpenWetWare - a wiki that promotes sharing and dissemination of knowledge 
related to biological research. 
• Palaeos.org - a paleontology and history of earth knowledge base. 


Computers 


• MeatballWiki - discusses online communities ("meatballs"). 
• OSBIWiki - a wiki focused on Open Source Business Intelligence. 
• UseModWiki - first wiki running UseModWiki WikiEngine and originally 
dedicated to the "Usenet Moderation Project". 
• WikiWikiWeb - Ward Cunningham's wiki, the world's oldest (March 1995). 
Devoted to computer programming. 
• WikiServer - a wiki focused on the development of a self-contained 
wiki-engine. 
• Javapedia - a wiki focussed on developing an online encyclopedia of Java 
Platform. 


Societal 


• Congresspedia, a separately branded sub-section of SourceWatch - an 
encyclopedia about the U.S. Congress. 
• Intellipedia - a classified wiki that runs on the top-secret network that links the 
U.S. intelligence community. It is not accessible to the public. 
• Jurispedia - an academic encyclopedia about Law. At present, it exists in Arabic, 
Chinese, English, French, German and Spanish editions. 
• MemoryArchive (formerly MemoryWiki) - a place to collect people's personal 
accounts of historical events. 
• SourceWatch (formerly Disinfopedia) - a wiki about propaganda, including 
organizations that seek to influence public opinion. 
• The Student Room - a student resource wiki offering revision notes, examples 
of university personal statements and information on student life, UK 
universities and courses. 
• Wikileaks - a wiki that allows people to leak official government documents 
anonymously. 
• WikiTree[1] - a wiki devoted to genealogy. 
• Wikocracy - allows users to re-write the Constitution of the United States of 
America, the USA PATRIOT Act, Roe v. Wade, the Digital Millennium Copyright 
Act or any law they feel needs changing. 


Travel 


• WikiMapia - online map resource that combines Google Maps with a wiki 
system. 35 languages currently supported. 
• Wikitravel - a travel guide, organized as a wiki, inspired in part by Wikipedia, 
but not part of Wikimedia. 
• World66 - a travel-based wiki. 


Alphabetical list of wiki communities 


BOWiki - a collaborative ontology curation framework and gene function editor 
Conservapedia - a conservative encyclopedia 

Enciclopedia Libre Universal en Espafiol - a Spanish language fork of 
Wikipedia. 

IMSLP - a wiki library of public domain music scores 

Intellipedia, a classified wiki that runs on a top-secret network that links the 
U.S. intelligence community. It is not accessible to the public. 

Javapedia - a wiki focussed on developing an online encyclopedia of Java 
Platform. 

Jurispedia - an academic encyclopedia about Law in several languages 
Lostpedia - a wiki dedicated to all things related to the TV show "Lost". 
MeatballWiki - discusses online communities ("meatballs") 

Memory Alpha - a Star Trek-related encyclopedia. 

OpenWetWare - a wiki that promotes sharing and dissemination of knowledge 
related to biological research. 

OSBIWiki - a wiki focusing on open source business intelligence software. 
PhpWiki - a wiki devoted to PHP software. 

SourceWatch (formerly Disinfopedia) - a wiki about propaganda, including 
organizations that seek to influence public opinion. 

Susning.nu - Swedish language encyclopedia, dictionary, and discussion forum. 
A competitor with the Swedish Wikipedia. 

The Student Room - contains a student resource wiki offering revision notes, 
example university personal statements, and information on UK universities 
and courses. 

TWiki - community wiki about wikis in general, and to develop, document, 
package and support the TWiki enterprise collaboration platform 
Uncyclopedia - a satirical parody of Wikipedia 

UseModWiki - first wiki running UseModWiki WikiEngine and originally 
dedicated to the "Usenet Moderation Project" 

Wikibooks - textbooks; sister project of Wikipedia. 

Wikihow - a wiki that explains how to do things. 

WikiMapia - a wiki using Google Maps. Users can pinpoint city landmarks. 
Wikinews - a collaborative news service; sister project of Wikipedia 

Wikinfo - a fork of Wikipedia, inspired by Wikipedia, but without the neutral 
point of view policy, instead using a sympathetic point of view. 

Wikipedia - the world's largest Encyclopedia. 
Wikiquote - a quote repository. A sister project of Wikipedia. 

Wikisource - Primary sources for use in Wikipedia, a sister project of 
Wikipedia. 

WikiTimeScale - an interactive graphical illustration of history. 

Wikitravel - a multi-language travel guide, organized as a wiki, inspired in part 
by Wikipedia. 

Wikitruth - a wiki dedicated to pointing out "atrocities" in Wikipedia 
WikiWikiWeb - Ward Cunningham's programming wiki, the world's oldest 
(March 1995). 

WikiZnanie - a Russian language encyclopedia. 

Wiktionary - a multilingual Dictionary project; a sister project of Wikipedia. 
Wookieepedia - a Star Wars encyclopedia 

World66 - a travel-based wiki. 


Wiki-like systems 


Bliki - "wikilog," a blog system organized as a wiki. It includes SnipSnap. 
Everything2 - a wikilike project for essays. 
PlanetMath - a free Wiki-style mathematical encyclopedia. 
TWiki - Wiki-like software sometimes used by software companies 


See also 


Wiki software 


Zope 


Zope is an open-source, object-oriented web application server written in the Python 
programming language. Zope stands for "Z Object Publishing Environment." It can be almost 
fully managed with a web-based user interface. Zope publishes on the web Python objects that 
are typically persisted in an object database, ZODB. Basic object types, such as documents, 
images, and page templates, are available for the user to create and manage through the web. 
Specialized object types, such as wikis, blogs, and photo galleries, are available as third-party 
add-ons (called products), and there is a thriving community of small businesses creating custom 
web applications as Zope products. 

There are two major generations of the software in use today. As of May 2007, Zope 2.10.3 
is the latest stable release of the Zope 2 codebase, and Zope 3.3.1 is the latest release of Zope 3. 
Zope is distributed under the terms of the Zope Public License, a free software license. 


History 


The Zope Corporation was formed in 1995 in Fredericksburg, VA, as a joint venture with 
InfiNet (a joint newspaper chain venture). The company developed a classified advertisement 
engine for the Internet, then in 1997 became an independently owned private company. At the 
behest of its largest investor, Opticality Ventures, Zope Corporation released its software as free 
software, concentrating on providing premier customization services. The company's software 
engineers are led by CTO Jim Fulton. PythonLabs, the creators of Python (the foundation of Zope), 
became part of the company in 2000. Python's creator, Guido van Rossum, is no longer 
employed by Zope, however. 

What is now known as Zope 2 started off with the merger of three separate software products 
— Bobo, Document Template, and BoboPOS — into the Principia application server. The 
authoring company, Digital Creations, open-sourced their flagship product in 1998, influenced 
by their venture capitalist, Hadar Pedhazur. This product was renamed "Zope" with this release. 

In November 2004, Zope 3 was released. Zope 3 is a complete rewrite that only preserves the 
original ZODB object database. The design of Zope 3 is driven by the needs of large companies, 
the clients of Zope Corporation, for complex business schemas. It is directly intended for 
enterprise web application development using the newest development paradigms. Zope 3 is, 
however, not compatible with Zope 2, so you cannot run Zope 2 applications on Zope 3. 
Therefore, an effort (called Five) to merge the new Zope 3 paradigms into Zope 2 has been 
ongoing since Zope 2.8. Its aim is to allow developers to write applications and libraries that run 
under both Zope 3 and Zope 2. 


Technical features 


A Zope website is composed of objects in an object database as opposed to files, as is usual 
with many other web server systems. This approach allows users to harness the advantages of 
object technologies, such as encapsulation. Zope maps URLs to objects using the containment 
hierarchy of such objects; methods are considered to be contained in their objects as well. 
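
The mapping from URLs to objects described above can be illustrated with a short Python sketch. This is not Zope's actual publisher: the Folder and Document classes, the traverse function and the example site are hypothetical names chosen for illustration, and security checks and acquisition are left out.

```python
class Folder:
    """A container object; its children are looked up by name."""

    def __init__(self, **children):
        self.children = dict(children)


class Document:
    """A leaf object with one publishable method."""

    def __init__(self, text):
        self.text = text

    def view(self):
        # Stand-in for rendering the document.
        return self.text.upper()


def traverse(root, url_path):
    """Walk the containment hierarchy one URL segment at a time."""
    obj = root
    for segment in url_path.strip("/").split("/"):
        if not segment:
            continue
        if isinstance(obj, Folder) and segment in obj.children:
            # The segment names a contained object.
            obj = obj.children[segment]
        elif not segment.startswith("_") and callable(getattr(obj, segment, None)):
            # Methods are treated as if they were contained in their object.
            obj = getattr(obj, segment)
        else:
            raise KeyError(segment)
    # If the path ends on a method, call it to produce the response.
    return obj() if callable(obj) else obj


site = Folder(docs=Folder(intro=Document("hello")))
print(traverse(site, "/docs/intro/view"))
```

In this sketch the path /docs/intro/view first resolves two contained objects and then a method of the last one, which is the behaviour the paragraph above describes.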

Zope comes with the Zope Object Database, which transparently persists (Python) objects in 
a transactional database. 

One particularly innovative feature of Zope is its widespread use of acquisition. Acquisition 
is a technique parallel to class inheritance, in which objects "inherit" behavior from their context 
in a composition hierarchy, as opposed to their class in a class hierarchy. This allows certain 
ways to structure source code that are otherwise harder to accomplish, and can encourage 
application decomposition. A common use is in structuring the way layout elements are used in a 
web page. Acquisition as implemented in Zope 2 is also perceived as a source of bugs, as it 
enables many unexpected behaviours. The use of acquisition has been severely narrowed in Zope 
3. 
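
A rough idea of acquisition can be given in plain Python. The sketch below is only an analogy and does not use Zope's Acquisition package: the Acquirer class and its parent attribute are assumptions made for illustration. An attribute that an object does not define itself is looked up in its container, then in the container's container, and so on.

```python
class Acquirer:
    """Object that falls back to its container for missing attributes."""

    def __init__(self, name, parent=None, **attrs):
        self.name = name
        self.parent = parent
        self.__dict__.update(attrs)

    def __getattr__(self, attr):
        # Called only when normal lookup (instance, then class) has failed.
        if attr.startswith("_"):
            raise AttributeError(attr)
        parent = self.__dict__.get("parent")
        if parent is None:
            raise AttributeError(attr)
        # Ask the containing object instead (context, not class hierarchy).
        return getattr(parent, attr)


site = Acquirer("site", header="Site header", footer="Site footer")
section = Acquirer("section", parent=site, header="Section header")
page = Acquirer("page", parent=section)

print(page.header)  # taken from the nearest container that defines it
print(page.footer)  # taken from the site root
```

The same mechanism also hints at why acquisition can surprise developers: which value page.header yields depends on where the page happens to be placed, not on its class.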

Zope provides two mechanisms for HTML templating: Document Template Markup 
Language (DTML) and Zope Page Templates (ZPT). DTML is a tag-based language which 
allows implementation of simple scripting in the templates. DTML has provisions for variable 
inclusion, conditions, and loops. However, DTML has major drawbacks: DTML tags 
interspersed with HTML form non-valid HTML documents, and careless inclusion of logic into 
templates results in very unreadable code. ZPT is a technology that fixes these shortcomings. 
ZPT templates can be either well-formed XML documents or HTML documents, in which all 
special markup is presented as attributes in the TAL (Template Attribute Language) namespace. 
ZPT offers just a very limited set of tools for conditional inclusion and repetition of XML 
elements, thus the templates are usually quite simple, with most logic implemented in Python 
code. One significant advantage of ZPT templates is that they can be edited in most graphical 
HTML editors. ZPT also offers direct support for internationalization. 
Zope 2 is the base behind the Nuxeo CPS, Plone, and Silva content management systems, as 
well as the base behind ERP5 open source enterprise resource planning. However, in late 2006 
Nuxeo announced that future versions of its CMS would be implemented in Java. 


Zope 3 


A new version of Zope, Zope 3, has been in development for some years. Zope 2 has proven 
itself as a useful framework for web applications development, but its use revealed some 
shortcomings. To name a few, creating Zope 2 products involves copying a lot of boilerplate 
code — "magic" code — that just has to be there, and the built-in management interface is difficult 
to modify or replace. Zope 3 is a rewrite of the software that attempts to address these 
shortcomings while retaining the advantages of Zope that led to its popularity. Zope 3 is based 
on a component architecture that makes it easy to mix software components of various origins 
written in Python. The first production release of the new software, Zope X3 3.0.0, was released 
on November 6, 2004. The current production release, Zope 3.3.1, was released on January 14, 
2007. 


Five and the Future of Zope 


The Five project is a Zope 2 extension that allows usage of Zope 3 technologies within Zope 2. 

Originally released separately, Five has become an integral part of the main line of Zope 2 
development. Five 1.0 was integrated with Zope 2 starting with Zope 2.8.0. Each successive 
Zope 2 release has integrated more of the features of Zope 3. 

The name Five is a bit of a joke: Zope 2 + Zope 3 = Five. 

In one likely future scenario, proposed by Jim Fulton, Zope 3 may re-focus on providing 
multiple independent infrastructure components, while Zope 2 and Five would become one 
project focused on providing a web application server framework (an ancillary goal is to reduce 
confusion about what exactly "Zope" is). 


Zope Page Templates 


As mentioned previously, Zope Page Templates are themselves XHTML documents, which 
means they can be viewed and edited using XHTML-compliant tools (a significant benefit 
compared to other template languages used for web applications). However, these page templates 
are not meant to be rendered as is. Instead, they are marked up with additional elements and 
attributes in special XML namespaces (see below). This additional information 
describes how the page template should ultimately be processed. 

Here are some basic examples. To conditionally include a particular element, like a div 
element, simply add the tal:condition attribute to the element as follows: 


<div tal:condition="...">... </div> 


To control what appears inside an element, use the tal:content attribute like this: 
<h1><span tal:content="..."/></h1> ... 


Finally, to introduce or replace values of attributes, use the tal:attributes attribute as follows. 
A Python expression can also be used to alter the href dynamically at runtime. 


<a href="" tal:attributes="href python:'http://someurl.com'">...</a> 


Because Zope Page Templates are themselves XHTML, they can not only be edited using 
normal HTML editors but also be checked for XHTML compliance in template form. 
As a result, you can be fairly confident that these templates will expand into proper 
XHTML. 
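
To make the processing step more concrete, here is a minimal Python sketch of how tal:condition and tal:content attributes could be interpreted. It is not Zope's implementation: the render and _process functions are hypothetical, expressions are treated as plain keys of a dictionary rather than as TALES expressions, and only elements below the root are processed.

```python
import xml.etree.ElementTree as ET

# Namespace URI used for TAL attributes.
TAL = "http://xml.zope.org/namespaces/tal"


def render(template, context):
    """Expand a tiny subset of TAL (condition and content) in a template."""
    root = ET.fromstring(template)
    _process(root, context)
    return ET.tostring(root, encoding="unicode")


def _process(element, context):
    for child in list(element):
        # tal:condition: drop the element when the named value is false.
        condition = child.attrib.pop(f"{{{TAL}}}condition", None)
        if condition is not None and not context.get(condition):
            element.remove(child)
            continue
        # tal:content: replace the element's content with the named value.
        content = child.attrib.pop(f"{{{TAL}}}content", None)
        if content is not None:
            for sub in list(child):
                child.remove(sub)
            child.text = str(context[content])
        else:
            _process(child, context)


TEMPLATE = """<div xmlns:tal="http://xml.zope.org/namespaces/tal">
  <h1 tal:content="title">Placeholder title</h1>
  <p tal:condition="show_note">Shown only when show_note is true.</p>
</div>"""

print(render(TEMPLATE, {"title": "Hello", "show_note": False}))
```

Note that the template itself stays well-formed markup with placeholder text, which is why it can be opened in an ordinary editor before it is expanded.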

This is a very cursory explanation of Zope Page Templates. The behavior of Zope Page 
Templates is almost completely described by a template language defined by the TAL, TALES, and 
METAL specifications: 

Template Attribute Language (TAL), 
Template Attribute Language Expression Syntax (TALES), 
Macro Expansion Template Attribute Language (METAL). 


List of content management systems 


This is a list of notable content management systems that are used to organize and 
facilitate collaborative content creation. Many of them are built on top of separate content 
management frameworks. 


Free and open source software 


This section lists free software and open source software. 


Name                             Platform                  Supported databases                               Latest stable release  License
Alfresco                         Java                      MySQL, Oracle, SQL Server, PostgreSQL, Informix   2                      GPL
Apache Lenya                     Java, XML, Apache Cocoon                                                    1.2.4                  Apache License
BLOG:CMS                         PHP                       MySQL                                                                    GPL
blosxom                          Perl                      Flat-file database                                2.0                    MIT
CMSimple                         PHP                       Flat-file database                                [illegible]            Affero
CorneliOS                        Perl                      MySQL and any Perl DBI, filesystem                [illegible]            GPL
Cyclone3                         [illegible]               MySQL and any Perl DBI                            3.0                    GPL
Daisy                            Java, XML, Apache Cocoon  MySQL                                             [illegible]            Apache License
Dokuwiki                         PHP                       Flat-file database                                2006-11-06             GPL
DotNetNuke                       ASP.NET                   Microsoft SQL Server                              4.5.3                  BSD
Drupal                           PHP                       MySQL/PostgreSQL                                  5.2                    GPL
e107                             PHP                       MySQL                                             0.7.8                  GPL
eZ publish                       PHP (PHP4 only)           MySQL/PostgreSQL/Oracle                           3.9.2                  GPL
Fedora                           Java                      MySQL or Oracle                                   [illegible]            Educational Community License
jAPS - java Agile Portal System  Java, [illegible]         HyperSonic SQL, PostgreSQL                                               GPL
Joomla!                          PHP                       MySQL                                             1.0.13                 GPL
KnowledgeTree Document           PHP                       MySQL                                             [illegible]            KPL (custom)
  Management System
Lyceum                           PHP                       MySQL                                                                    GPL
Mambo                            PHP                       MySQL                                             4.6.2                  GPL
MediaWiki                        PHP                       MySQL, PostgreSQL                                 1.10.1                 GPL
Midgard CMS                      PHP (Midgard framework)   MySQL                                                                    GPL
MMBase                           Java                      MySQL/PostgreSQL                                                         MPL
MODx CMS                         PHP                       MySQL                                             0.9.6                  GPL
Nucleus CMS                      PHP                       MySQL                                             3.23                   GPL
Nuxeo CPS                        Zope                      ZODB                                              3.4.3                  GPL
OpenACS                          TCL, AOLserver            PostgreSQL/Oracle                                 [illegible]            GPL
OpenCms                          Java                      MySQL, Oracle                                     7.0.0                  LGPL
OpenPortals                      PHP                       MySQL                                             4.0.3                  GPL
phpCMS                           PHP                       Flat-file database                                [illegible]            GPL
PHP-Fusion                       PHP                       MySQL                                             6.01.11                GPL
PHP-Nuke                         PHP                       MySQL                                             8.0                    GPL
phpWCMS                          PHP                       MySQL                                             1.3.3                  GPL
phpWebSite                       PHP                       MySQL or PostgreSQL                               1.1.0                  LGPL
PhpWiki                          PHP                       Flat-file database/MySQL/PostgreSQL etc.                                 GPL
Plone                            [illegible]               ZODB, SQLite, PostgreSQL, MySQL, Oracle via Zope  [illegible]            GPL
PmWiki                           PHP                       Flat-file database                                                       GPL
PostNuke                         PHP                       MySQL                                             .764                   GPL
Quick.Cms.Lite                   PHP                       Flat-file                                         1.2                    GPL
Scoop                            Perl on mod_perl          MySQL                                             1.1.8                  GPL
SilverStripe                     PHP                       MySQL                                             2.0.0                  BSD
SiteFrame                        PHP + Smarty              MySQL                                             5.0.2                  Creative Commons
Slash                            Perl on mod_perl          MySQL                                                                    GPL
SPIP                             PHP                       MySQL                                             1.9.2                  GPL
Textpattern                      PHP                       MySQL                                             4.0.55                 GPL
TikiWiki                         PHP                       ADOdb                                             1.9.7                  LGPL
TWiki                            Perl                      Perl DBI compatible                               4.0.4                  GPL
Typo                             Ruby on Rails             MySQL, PostgreSQL, SQLite                                                MIT
TYPO3                            PHP                       MySQL, PostgreSQL, Oracle                         4.1.2                  GPL
WebGUI                           Perl on mod_perl          MySQL                                                                    GPL
WordPress                        PHP                       MySQL                                             2.2.2                  GPL
Xaraya                           PHP                       MySQL, PostgreSQL, SQLite using ADOdb and         1.1.3                  GPL
                                                           Microsoft SQL Server with Creole
XOOPS                            PHP                       MySQL                                             2.2                    GPL


Freeware 


This section lists freeware. 


Name                             Platform                  Supported databases                               Latest stable release  Online Demo
CivicSpace                       PHP                       MySQL                                             0.8.3
Movable Type                                               MySQL or MS SQL Server or Oracle or PostgreSQL    [illegible]            Yes
                                                           or SQLite


Commercial, low cost (< $5,000) 


Name                             Platform                  Supported databases                               Latest stable release  Price in USD                    Online Demo
Community Server                 ASP.NET                   SQL Server                                        2007                   $300                            Yes
Expanse (CMS)                    PHP                       MySQL                                             Unknown                $29.99                          Yes
WebHat                           [illegible]               [illegible], Oracle, SQL Server                   [illegible]            from $1000                      Yes


Commercial, medium cost ($5,000 - $15,000) 


Name                             Platform                  Supported databases                               Latest stable release  Price in USD                    Online Demo
[illegible] Edition              Java                                                                        5.6.1                                                  Yes
Simplicis                        JSP                       Any SQL-92                                        3.0.2                  $5,999/year                     Yes
Traction TeamPage                Java                      Built-in                                          [illegible]            $5,000 and [illegible]          [illegible]
Commercial, high cost (> $15,000) 


Name                             Platform                  Supported databases                               Latest stable release  Price in USD                    Online Demo
CoreMedia CMS                    Java                      Oracle, IBM DB2, Microsoft SQL Server             CoreMedia CMS 2006
Documentum (owned by EMC)        Java                      Oracle, Microsoft SQL Server, IBM DB2             5.3sp4
EM3 i10n                         Java                      [illegible] Server, Oracle                        [illegible]            starts at $80,000, multi-site,  Yes
                                                                                                                                    unlimited users
FatWire                          Java                      Oracle, Microsoft SQL Server, IBM DB2, Sybase,    Content Server 7.0.2
                                                           MySQL
FileNet (owned by IBM)           Java                      [illegible], DB2                                  Content Manager 4.0
Immediacy                        .NET                      [illegible] Server
Jadu                             PHP and .NET              MS SQL Server, MySQL                              2.0x
Jalios JCMS                      Java/J2EE                 [illegible]                                                              Starts at €9,000; JCMS          Yes
                                                                                                                                    Enterprise Edition is €75,000
Livelink ECM                     J2EE                      Oracle Database or MS SQL Server                  9.7                    price per named [illegible]     Yes
[illegible]                      .NET                      Microsoft SQL Server (2000 or 2005), SQL Express  2007                                                   Yes
RedDot CMS (owned by Open Text)  Windows                   MS SQL Server, Oracle Database                    [illegible]            price per CPU or named users    Yes
Rhythmyx                         XML, J2EE                 Oracle database or MS SQL Server                  6.0                                                    No
Socialtext                       Perl
Stellent (owned by Oracle)       IDocScript, XML           Oracle, SQL Server, [illegible]
[illegible] Management                                     Oracle Database, MS SQL Server, IBM DB2           7.3.1
[illegible]                                                All supported by Hibernate                        421


Unclear licensing conditions 


The vendors of this software do not disclose prices or licensing details and ask to be 
contacted individually for this information. 


Name                             Platform                  Supported databases                               Latest stable release
Aegir                            Midgard add-on                                                              1.0.3
Ariadne                                                    Oracle, PostgreSQL
Doop (CMS)                       PHP                       Flat-file database                                [illegible]

See also 


List of web application frameworks 

OSCOM, the central organization for open source content management, provides 
many resources on open source 

Enterprise Content Management 


List of content management frameworks 


A content management framework is an application programming interface for creating a 
customized content management system. 
Some such systems are: 


Name Technologies 

Apache Cocoon Java 

AxKit Perl 

Catalyst Perl 

CherryPy Python 

Drupal PHP 

eZ publish PHP and MySQL 

Jakarta Slide Java 

Joomla! PHP and MySQL 

Mambo PHP and MySQL 

Maypole Perl 

Midgard (also in a "Lite" edition) PHP and MySQL 

MODx CMS and Application Framework PHP 4.1.x-5 and MySQL 3.2x-5 
Monk CMS PHP and MySQL 

OpenACS AOLserver and PostgreSQL or Oracle 
OpenPortals PHP and MySQL 

phpXCore PHP and MySQL 

Pier Smalltalk and Seaside 

Plone Python 

RIFE Java 
Seagull PHP 4/5, works with MySQL, Oracle or PostgreSQL 
Slither 

TribalCMS PHP, MySQL and AJAX 

TYPO3 PHP and MySQL 

Virtual-OS PHP and MySQL 

Xaraya PHP 

Xoops PHP and MySQL 

See also 


List of content management systems 


Document management system 


A document management system (DMS) is a computer system (or set of computer 
programs) used to track and store electronic documents and/or images of paper documents. The 
term has some overlap with the concepts of Content Management Systems and is often viewed as 
a component of Enterprise Content Management Systems and related to Digital Asset 
Management, Document imaging, Workflow systems and Records Management systems. 


Overview 


A document management system will typically address some or all of the following areas: 

Location and Time 
Retrieval                Typically via a built-in search engine. Some also allow documents to 
                         be retrieved using metadata (date, time, tags, document type, etc.). 
Filing                   Organization? Strategy? 
Security                 Protection against loss, tampering or destruction of documents? How 
                         to deal with sensitive information? 
Archival                 Readability? How can we protect our documents against fires, floods 
                         or natural disasters? 
Retention                What to retain? Length of retention? Removal? 
Distribution             People? Cost of distribution? 
Workflow                 If documents need to pass from one person to another, what are the 
                         rules for how their work should flow? 
Creation                 Number of people and logistics of collaboration? 
Authentication/Approval  How do we provide needed requirements for legal submission to 
                         government and private industry that the documents are original and 
                         meet their standards for authentication? 


History 


Beginning in the 1980s, a number of vendors began developing systems to manage paper- 
based documents. Initially designed to offer mainly document imaging-level capture, storage, 
indexing and retrieval capabilities, the applications grew to encompass electronic documents, 
collaboration tools, security, and auditing capabilities... 


Document Management and Communication 


Electronic document management is worked out in particular by Carzaniga and Wolf (2001) 
in their paper “Content-based networking: a new communication infrastructure”. The authors 
introduce content-based networking as a communication infrastructure in which the flow of 
information through the network is driven by its content. Users express their interests, and 
senders simply inject messages into the network. From that point, the network delivers the 
information to the right people. Sprague (1995) delivers a more elaborate work in which he 
introduces document management using IT, which he calls electronic document 
management (EDM). He defines the managing of documents as the “creation, storage, organization, 
transmission, retrieval, manipulation, update, and eventual disposition of documents to fulfill an 
organizational purpose” (p. 32), and he further states that EDM improves communication among 
people and groups of people (pp. 42-43). 

There are several other examples in the literature of the link between EDM and 
communication. Hansen and Haas (2001) elaborate on the role of the suppliers and users of 
information in electronic documents. Another study with a very clear link between EDM systems 
and communication is that of Thorpe and Mead (2000), who showed that an EDM system changes 
communication patterns. Of the three case projects they studied, EDM acquired a central 
role in two (the third project was abandoned after three months). A study by Howard 
and Pettersen (2001) of communication in a construction project found that 
EDM (which Howard and Pettersen call the project web) ranked third among communication tools, 
just after the telephone and meetings, and ahead of e-mail, paper post and fax. Rene Brohm (2005) 
introduced the theater model in his dissertation. The theater model illustrates metaphorically 
how document management systems correspond to a stage in a theater. His argument is 
that the interaction in a play on the stage is similar to the functioning of a document system. 

According to Marin & Poulter (2004), if all data and information were put in a central 
database or intranet that everyone in the organization can use, there would be a clear link 
between IT and the dissemination of information. They argue that, because the information is 
easy to access, it would flow through the organization, and they state that the distribution 
of intelligence can be aided by technology. 

There are different ways of improving this communication tool. Hansen and Haas (2001) see 
electronic document management as a market, with competition. According to them, suppliers 
should have a strategy for how to share information and how to persuade their clients 
(employees) to use the system. 

One way to do this is introduced by Yan & Garcia-Molina (1999, p. 2), who use EDM to let a user 
“make long term profile consisting of a number of standing queries to represent his information 
needs”. They state that this improves the dissemination of information: users receive 
information in their field of interest because of the profile they submitted. Search 
costs and search time for employees therefore decrease. 
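
The idea of profiles made up of standing queries can be sketched in a few lines of Python. This is an illustration only, not the system described by Yan & Garcia-Molina: the Profile class, the matches and disseminate functions, and the rule that a query is a set of keywords that must all occur in the document are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    """A user's long-term profile: a list of standing queries."""
    user: str
    standing_queries: list  # each query is a set of required keywords


def matches(query, text):
    """A query matches when every one of its keywords occurs in the text."""
    words = set(text.lower().split())
    return all(keyword.lower() in words for keyword in query)


def disseminate(document_text, profiles):
    """Return the users who should receive a newly stored document."""
    return [
        profile.user
        for profile in profiles
        if any(matches(query, document_text) for query in profile.standing_queries)
    ]


profiles = [
    Profile("ann", [{"budget", "2007"}]),
    Profile("bob", [{"floor", "plan"}, {"safety"}]),
]
print(disseminate("Revised budget forecast for 2007", profiles))
```

Because the queries are stored once and evaluated whenever a document arrives, users do not have to repeat the same search, which is the reduction in search cost the paragraph above refers to.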


Components 
Document management systems commonly provide storage, versioning, metadata, security, 
as well as indexing and retrieval capabilities. Here is a description of these components. 

Metadata 

Metadata is typically stored for each document. Metadata may, for example, include the date 
the document was stored and the identity of the user storing it. The DMS may also extract 
metadata from the document automatically or prompt the user to add metadata. Some systems 
also use optical character recognition on scanned images, or perform text extraction on electronic 
documents. The resulting extracted text can be used to assist users in locating documents by 
identifying probable keywords or providing for full text search capability, or can be used on its 
own. Extracted text can also be stored as a component of metadata, stored with the image, or 
separately as a source for searching document collections. 

Integration 

Many document management systems attempt to integrate document management directly 
into other applications, so that users may retrieve existing documents directly from the document 
management system repository, make changes, and save the changed document back to the 
repository as a new version, all without leaving the application. Such integration is commonly 
available for office suites and e-mail or collaboration/groupware software. Integration often uses 
open standards such as ODMA, LDAP, WebDAV and SOAP to allow integration with other 
software and compliance with internal controls. 
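
The retrieve, change and save-back cycle described above can be sketched as a very small versioned repository. The Repository class and its check_in and retrieve methods are hypothetical names for illustration; a real system would add locking, access control and persistent storage, and would be reached through a standard such as WebDAV rather than direct calls.

```python
class Repository:
    """Keeps every saved state of a document as a numbered version."""

    def __init__(self):
        self._versions = {}  # document id -> list of contents, oldest first

    def check_in(self, doc_id, content):
        """Store content as a new version and return its version number."""
        versions = self._versions.setdefault(doc_id, [])
        versions.append(content)
        return len(versions)  # version numbers start at 1

    def retrieve(self, doc_id, version=None):
        """Return the latest version, or a specific one if requested."""
        versions = self._versions[doc_id]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{doc_id} has no version {version}")
        return versions[version - 1]


repo = Repository()
repo.check_in("memo", "First draft")
text = repo.retrieve("memo")                       # open from the repository
repo.check_in("memo", text + ", now reviewed")     # save back as a new version
print(repo.retrieve("memo"))
print(repo.retrieve("memo", version=1))
```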

Capture 

Images of paper documents are captured using scanners or multifunction printers. Optical Character 
Recognition (OCR) software is often used, whether integrated into the hardware or as stand- 
alone software, in order to convert digital images into machine readable text. 

Indexing 

Track electronic documents. Indexing may be as simple as keeping track of unique document 
identifiers; but often it takes a more complex form, providing classification through the 
documents' metadata or even through word indexes extracted from the documents' contents. 
Indexing exists mainly to support retrieval. One area of critical importance for rapid retrieval is 
the creation of an index topology. 

Storage 

Store electronic documents. Storage of the documents often includes management of those 
same documents; where they are stored, for how long, migration of the documents from one 
storage media to another (Hierarchical storage management) and eventual document destruction. 

Retrieval 

Retrieve the electronic documents from the storage. Although the notion of retrieving a 
particular document is simple, retrieval in the electronic context can be quite complex and 
powerful. Simple retrieval of individual documents can be supported by allowing the user to 
specify the unique document identifier, and having the system use the basic index (or a non- 
indexed query on its data store) to retrieve the document. More flexible retrieval allows the user 
to specify partial search terms involving the document identifier and/or parts of the expected 
metadata. This would typically return a list of documents which match the user's search terms. 
Some systems provide the capability to specify a Boolean expression containing multiple 
keywords or example phrases expected to exist within the documents’ contents. The retrieval for 
this kind of query may be supported by previously built indexes, or may perform more time-consuming 
searches through the documents' contents to return a list of the potentially relevant 
documents. 
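
The three kinds of retrieval described above (by unique identifier, by metadata, and by Boolean keyword expression over a previously built index) can be sketched as follows. The DocumentIndex class and its method names are assumptions made for this example; tokenization is a plain whitespace split, and only AND and OR over whole words are supported.

```python
from collections import defaultdict


class DocumentIndex:
    """Stores documents with metadata and a word index built at add time."""

    def __init__(self):
        self.documents = {}                 # document id -> text
        self.metadata = {}                  # document id -> metadata dict
        self.word_index = defaultdict(set)  # word -> ids of documents containing it

    def add(self, doc_id, text, **metadata):
        self.documents[doc_id] = text
        self.metadata[doc_id] = metadata
        for word in text.lower().split():
            self.word_index[word].add(doc_id)

    def get(self, doc_id):
        """Simple retrieval by unique document identifier."""
        return self.documents[doc_id]

    def find_by_metadata(self, **criteria):
        """Retrieval by metadata: every given field must match exactly."""
        return sorted(
            doc_id
            for doc_id, meta in self.metadata.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        )

    def search_all(self, *keywords):
        """Boolean AND: documents containing every keyword."""
        sets = [self.word_index.get(k.lower(), set()) for k in keywords]
        return sorted(set.intersection(*sets)) if sets else []

    def search_any(self, *keywords):
        """Boolean OR: documents containing at least one keyword."""
        sets = [self.word_index.get(k.lower(), set()) for k in keywords]
        return sorted(set().union(*sets))


index = DocumentIndex()
index.add("d1", "quarterly financial report", type="report")
index.add("d2", "financial audit memo", type="memo")
print(index.search_all("financial", "report"))
print(index.search_any("audit", "report"))
```

Both keyword searches only consult the word index; the slower alternative mentioned in the text would scan the stored text of every document instead.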

Distribution 

Security 

Workflow 

Collaboration 

Versioning 


Enterprise content management 


Enterprise Content Management (ECM) is any of the strategies and technologies 
employed in the information technology industry for managing the capture, storage, security, 
revision control, retrieval, distribution, preservation and destruction of documents and content. 

ECM especially concerns content imported into or generated from within an organization in 
the course of its operation, and includes the control of access to this content from outside of the 
organization's processes. 


Purpose 


ECM systems are designed to manage both structured and unstructured content, so that an 
organization, such as a business or governmental agency, can more effectively meet business 
goals (increase profit or improve the efficient use of budgets), serve its customers (as a 
competitive advantage, or to improve responsiveness), and protect itself (against non- 
compliance, law-suits, uncoordinated departments or turnover within the organization). In a large 
enterprise, ECM is not regarded as an optional expense: it is essential to content 
preservation and re-usability, and to the control of access to content. Very small 
organizations, by contrast, may find their needs temporarily met by carefully managed shared 
folders and a wiki, for example. 

Recent trends in business and government indicate that ECM is becoming a core investment 
for organizations of all sizes, more immediately tied to organizational goals than in the past: 
increasingly central to what an enterprise does, and how it accomplishes its mission. 


Definition 


The "official" definition of enterprise content management was created by AIIM 
International, the worldwide association for enterprise content management, in 2000. The 
abbreviation ECM has been reinterpreted and redefined many times during the past years, 
replacing words like "create" or "customize" that were originally part of it. 

In autumn 2005 AIIM defined ECM as follows: 

Enterprise Content Management is the technologies used to Capture, Manage, Store, 
Preserve, and Deliver content and documents related to organizational processes. 

In winter 2006 AIIM added the following paragraph to the definition: 

ECM tools and strategies allow the management of an organization's unstructured 
information, wherever that information exists. 

This new term is intended to completely encompass the legacy problem domains that have 
traditionally been addressed by records management and document management. It also includes 
all of the additional problems involved in converting to and from digital content, to and from the 
traditional media of those problem domains (such as physical and computerized filing and 
retrieval systems, often involving paper and microforms). Finally ECM is a new problem domain 
in its own right, as it has employed the technologies and strategies of (digital) content 
management to address business process issues, such as records and auditing, knowledge 
sharing, personalization and standardization of content, and so on. 

New product suites have arisen from the combination of capture, search and networking 
capabilities with technologies of the content management field, which have traditionally 
addressed digital archiving, document management and workflow. Generally speaking, this is 
when content management becomes enterprise content management. The different nomenclature 
is intended to encompass all of the problem areas related to the use and preservation of 
information within an organization, in all of its forms - not just its web-oriented face to the 
outside world. Therefore, most solutions focus on "business to employee" (B2E) systems. 
However, as the solutions have evolved, new components to content management have arisen. 
For example, as unstructured content is checked in and out of an ECM system, each use can 
potentially enrich the content's profile, to some extent automatically, so that the system might 
gradually acquire or "learn" new filtering, routing and search pathways, corporate taxonomies 
and semantic networks, which in turn assist in making better retention-rule decisions, 
determining which records or documents to keep, and which to discard, and when. Such issues 
become all the more important, as email and instant messaging are increasingly employed in the 
decision-making processes in an organization. 

Thus, the term enterprise content management refers to solutions that concentrate on 
providing in-house information, usually using internet technologies. The solutions tend to 
provide intranet services to employees (B2E), but also include enterprise portals for "business to 
business" (B2B), "business to government" (B2G), or "government to business" (G2B), etc. This 
category includes most of the former document management groupware and workflow solutions 
that have not yet fully converted their architecture, but provide a web interface to their 
applications. Digital Asset Management (DAM) is also a form of ECM that is concerned with 
content stored using digital electronic technology. 


History 


The technology components that comprise ECM today are the descendants of the electronic 
document management systems (EDMS) software products that were first released in the late 
1980s and early 1990s. The original EDMS products were developed as stand-alone 
technologies, and these products provided functionality in one of four areas: imaging, workflow, 
document management, or COLD/ERM (see "Components of an enterprise content management 
system," below). 

For the software companies, it made sense to develop different products for each of these 
distinct EDMS functions. At that time, most organizations that were candidates for EDMS 
generally wanted a solution to address just one overriding business need or application. They 
were looking for stand-alone solutions to address narrow application needs, many of them at the 
departmental level — such as imaging for forms processing, workflow for insurance claims 
processing, document management for engineering documentation, or COLD/ERM for 
distributing and archiving monthly financial reports. 

The typical "early adopter" of these new technologies was an organization that deployed a 
small-scale imaging and workflow system, possibly to just a single department, in order to 
improve the efficiency of a repetitive, paper-intensive business process and migrate towards the 
paperless office. Even in these early years, when the market for these software products was still 
relatively immature, it was clear that each of the major technologies within EDMS offered 
tremendous value to specific organizational processes or applications, at a time when business 
processes were overwhelmingly paper-based. The primary benefits that the first stand-alone 
EDMS technologies brought to organizations revolved around saving time or improving 
accessibility to information. Among the specific benefits were the following: 

• Reduction of paper handling and error-prone manual processes 
• Reduction of paper storage 
• Reduction of lost documents 
• Faster access to information 
• Online access to information that was formerly available only on paper, 
  microfilm, or microfiche 
• Improved control over documents and document-oriented processes 
• Streamlining of time-consuming business processes 
• Security over document access and modification 
• Reliable and accurate audit trails 
• Improved tracking and monitoring, with the ability to identify bottlenecks and 
  modify the system to improve efficiency 

Through the late 1990s, the various segments of the EDMS industry continued to grow 
steadily, if not spectacularly. The technologies appealed to organizations with clear problems, 
and which needed targeted, tactical solutions to address those problems. 

As time passed, and more organizations had achieved "pockets" of productivity with the use 
of these technologies, it became clear that the various EDMS product categories were in fact 
complementary for many businesses. Organizations increasingly wanted to be able to leverage 
the capabilities of multiple EDMS products. Consider, for example, the needs of a customer 
service department, where imaging, document management, and workflow functionality could be 
brought together to allow agents to access any information needed to resolve a customer inquiry. 
Likewise, an accounting department could access supplier invoices from a COLD/ERM system, 
purchase orders from an imaging system, and contracts from a document management system as 
part of an approval workflow. And as more and more organizations established an Internet 
presence, they wanted to present certain of this information via the web, which required the 
capabilities to manage web content. Furthermore, organizations that had installed the software in 
individual departments now began to envision wider benefits, if they were to deploy it across the 
enterprise. Consider the fact that many business documents cross multiple departments and 
multiple business processes. Why not improve the management of electronic documents 
throughout the organization, and gain the same business benefits at an enterprise level? 

Both the market and the software providers began to understand the strategic potential of 
software products that integrated the individual EDMS technology components into a single, 
integrated solution, capable of addressing an organization's complete information management 
needs. In fact, the movement toward integrated EDMS solutions merely reflected a common 
trend in the history of the software industry: the obsolescence of certain types of products and the 
convergence of technologies, as vendors melded them into new packages. 

Consider office suites, for instance. In the 1970s and early 1980s, word processing, 
spreadsheet, and presentation software products were standalone products. Within an 
organization, however, the same users were likely to need all three products. The software 
vendors responded, and started packaging them as integrated office suites — a strategy that also 
helped address consumer demand for tighter interoperability among desktop applications. 

The situation was similar in the EDMS world. Just about any company that needed document 
management also needed imaging, workflow, web content management, and COLD/ERM. 
Organizations began to demand multiple EDMS services and ways to leverage them for broad- 
based applications. Thus, the EDMS vendors took steps to deliver on truly integrated solutions 
incorporating the EDMS component technologies. 

The leaders tended to be those vendors that already offered multiple stand-alone EDMS 
technologies. For these vendors, the early steps toward consolidation were small ones. The first 
phase was to offer multiple systems as a single, packaged "suite." Early suites were little more 
than multiple products being sold together at a reduced price, and there was a perception in the 
market that such suites were a strategy on the part of the vendors to capture additional seats 
within a customer account. Not surprisingly, market acceptance was limited — at least initially. 

But in the late 1990s, these software vendors began a major surge of software development 
and acquisition activity, adding capabilities to their software products or buying the software 
companies whose products offered the functional capabilities they needed. Integrating the 
products into a single solution has proven to be an ongoing challenge for many of these vendors. 
Scalability — that is, the ability of a software product to continue to function well when it is 
deployed on a wide scale — also presented some significant problems, as organizations demanded 
solutions that could be deployed not just to multiple geographic locations, but on a global scale, 
to tens of thousands of users. 

In response to these market demands, the major software providers put considerable 
development effort into addressing these issues, and they continue to enhance the capabilities of 
their products and to expand the types of content those products can manage. Beginning in 
approximately 2001, the industry began to use the term "enterprise content management" to refer 
to those software solutions that provide the full complement of EDMS technologies, reflecting 
the truly "enterprise" nature of their products. 

More recently, the ECM market has seen the entry of Microsoft and Oracle Corporation, two 
of the largest and most pervasive providers of software, at the value end of the market. These 
companies have each taken steps to develop solutions for content management — Microsoft with 
its various offerings in the SharePoint product family in recent years, and Oracle in 2006 with its 
Oracle Content Management product. These two software companies look to provide software 
solutions with the basic ECM functionality that will address the functional requirements 
commonly required by the majority of organizations. The result is likely to be a stratification of 
the current ECM market, based on the level of content services that different organizations 
require. 

Independently of Microsoft and Oracle, open source enterprise content management systems 
have emerged to also provide basic ECM functionality. These include Alfresco, Nuxeo CPS and 
Plone. Similarly to the operating system, application server and database markets, these entrants 
hope to apply the open source distribution model of freely available and downloadable software 
to compete against the traditional enterprise software sales model of the incumbent ECM 
vendors and commoditize the ECM market. 

The need for scalability and for scanning facilities that can handle hundreds of millions of 
documents, requiring terabyte, petabyte or exabyte file stores that comply with existing and 
emerging standards such as HIPAA, SAS 70, BS 7799 and ISO/IEC 27001, may make 
outsourcing to certified end-to-end service providers a viable alternative. 


Characteristics 


Content management has many facets including enterprise content management, Web 
content management (WCM), content syndication and digital or media asset management. 
Enterprise content management is a vision, a strategy, or even a new industry, but it is not a 
closed system solution or a distinct product. Therefore, along with DRT (Document Related 
Technologies) or DLM (Document Lifecycle Management), ECM can be considered as just one 
possible catch-all term for a wide range of technologies and vendors. 

A comparison of the definitions of the different application fields of ECM and WCM makes 
it clear that the existing system category distinctions cannot last long, whether for products and 
technical platforms or for usage models. Solutions that are used as pure in-house solutions today 
will be made accessible to partners or customers tomorrow. The content and structure of today's 
outward-directed web portal will be the platform for tomorrow's internal information system. In 
his article in ComputerWoche, Ulrich Kampffmeyer condensed the claimed benefit of an 
enterprise content management system into three key ideas that distinguish such solutions from 
Web content management: 

"Enterprise Content Management as integrative middleware 

ECM is used to overcome the restrictions of former vertical applications and island 
architectures. The user is basically unaware of using an ECM solution. ECM offers the 
requisite infrastructure for the new world of web-based IT, which is establishing itself 
as a kind of third platform alongside conventional host and client/server systems. 
Therefore, EAI Enterprise Application Integration and SOA Service Oriented 
Architecture will play an important role in the implementation and use of ECM. 

Enterprise Content Management components as independent services 

ECM is used to manage Information without regard to the source or the required use. 
The functionality is provided as a service that can be used from all kinds of applications. 
The advantage of a service concept is that for any given functionality only one general 
service is available, thus avoiding redundant, expensive and difficult to maintain 
parallel functions. Therefore, standards for interfaces connecting different services will 
play an important role in the implementation of ECM. 

Enterprise Content Management as an uniform repository for all types of information 

ECM is used as a content warehouse (both data warehouse and document warehouse) 
that combines company information in a repository with a uniform structure. Expensive 
redundancies and associated problems with information consistency are eliminated. All 
applications deliver their content to a single repository, which in turn provides needed 
information to all applications. Therefore, Content Integration and ILM Information 
Lifecycle Management will play an important role in the implementation and use of 
ECM. 


Enterprise Content Management is working properly when it is effectively "invisible" to 
users. ECM technologies are infrastructures that support specialized applications as 
subordinate services. ECM thus is a collection of infrastructure components that fit into a 
multi-layer model and include all Document Related Technologies (DRT) for handling, 
delivering, and managing structured data and unstructured information jointly. As such, 
Enterprise Content Management is one of the necessary basic components of the 
overarching E-Business application area. ECM also sets out to manage all the information of 
a WCM and covers archiving needs as an universal repository."[6] 


Components of an enterprise content management system 


Enterprise content management systems combine a wide variety of technologies and 
components, some of which can also be used as stand-alone systems without being incorporated 
into an enterprise-wide system. 

These ECM components and technologies are categorized as: 


• Capture 
• Manage 
• Store 
• Preserve 
• Deliver 


This model is based on the five lead categories of AIIM International. 

The traditional application areas are: 

• Document management (DM), 
• Collaboration (or collaborative software, groupware), 
• Web content management (WCM) (including web portals), 
• Records management (RM) (archive and filing management systems on long-term 
  storage media) and 
• Workflow / Business process management (BPM) 

These form the "Manage" components that connect Capture, Store, Deliver and Preserve and 
can be used in combination or separately. While Document Management, Web Content 
Management, Collaboration, Workflow and Business Process Management are more for the 
dynamic part of the life cycle of information, Records Management takes care of information 
which will no longer be changed. The utilization of the information is paramount throughout, 
whether through independent clients of the ECM system components, or by enabling existing 
applications that access the functionality of ECM services and the stored information. The 
integration of existing technologies makes it clear that ECM is not a new product category, but 
an integrative force. 

The individual categories and their components are examined in the following sections. 


Capture 
The "Capture" category contains functionalities and components for generating, capturing, 
preparing and processing analog and electronic information. There are several levels and 
technologies, from simple information capture to complex information preparation using 
automatic classification. Capture components are often also called "Input" components. 

Manually generated and captured information 

Manual capture can involve all forms of information, from paper documents to electronic 
office documents, e-mails, forms, multimedia objects, digitized speech and video, and microfilm. 

Automatic or semi-automatic capture can use EDI or XML documents, business and ERP 
applications or existing specialist application systems as sources. 

Technologies for processing captured information 

Various recognition technologies are used to process scanned documents and digital faxes, 
among them: 

Optical Character Recognition (OCR) 
This converts image information into machine-readable characters. OCR is used for 
typed or printed text. 

Handprint Character Recognition (HCR) 
This refinement of OCR converts handwriting or lettering into machine characters, but 
does not yet give satisfactory results for running text. However, for defined field 
content, it has become very reliable. 

Intelligent Character Recognition (ICR) 
ICR is a further development of OCR and HCR that uses comparison, logical 
connections, and checks against reference lists and existing master data to improve 
results. 

Optical Mark Recognition (OMR) 
OMR, as used for checkboxes for example, reads special markings in predefined fields 
with very high accuracy. It has proven its value in questionnaires and other forms. 

Barcode 
Barcodes on mailed forms allow for the automatic recognition and filing of returns. 

Document Imaging 

Document imaging processing techniques are used to show scanned images, and also allow 
legibility enhancement for capture. Functions like "despeckling," which removes isolated pixels, 
or "adjustment," which straightens images from sheets that feed in at an angle, improve the 
results of recognition technologies. Document imaging functions are used in capture quality 
control. 
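
The "despeckling" function mentioned above can be illustrated with a minimal sketch. It assumes a binary image held as nested lists (1 for a black pixel, 0 for white) and treats a black pixel as a speck when none of its eight neighbours is black. The function name and the neighbourhood rule are assumptions made for illustration; production imaging software typically uses more elaborate filters.

```python
from typing import List

Image = List[List[int]]  # 1 = black pixel, 0 = white pixel


def despeckle(image: Image) -> Image:
    """Return a copy of a binary image with isolated black pixels removed."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    cleaned = [row[:] for row in image]  # leave the input untouched
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            # Count black pixels among the (up to) eight neighbours.
            neighbours = sum(
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            )
            if neighbours == 0:
                cleaned[r][c] = 0  # an isolated pixel is treated as noise
    return cleaned


# A lone pixel at row 1, column 1, and a solid 2x2 block in the corner.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
cleaned_page = despeckle(page)
```

In this example the lone pixel is cleared while the 2x2 block survives, because each of its pixels has black neighbours.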

Forms processing 

In forms capture, there are two groups of technologies, although the information content and 
character of the documents may be identical. 

Paper Forms 
Forms processing means the capture of industrially or individually printed forms via 
scanning. Recognition technologies are often used here, since well-designed forms 
enable largely automatic processing. 

E-Forms / Web-Forms 
Automatic processing can be used to capture electronic forms as long as the layout, 
structure, logic and contents are known to the capture system. 

COLD/ERM 

COLD/ERM are technologies for the automatic processing of structured input data. COLD 
stands for Computer Output to Laser Disk, a name that is still in use although laser disks have 
not been on the market for years. The acronym ERM here stands for Enterprise Report 
Management. In both, supplied output data is processed based on existing structure information 
in such a way that it can be indexed independently of the originating system, and transferred to 
a storage component that can be dynamic (Store) or an archive (Preserve). 

Aggregation 

Aggregation is a process of combining data entries from different creation, capture, and delivery 
applications. The goal is to combine and unify data from different sources, in order to pass them 
on to storage and processing systems with a uniform structure and format. 

Components for subject indexing of captured information 

Systems incorporate further components for subject indexing and getting captured digital 
information to the appropriate recipients. These include: 

Indexing (manual) 
In English parlance, indexing refers to the manual assignment of index attributes used 
in the database of a "manage" component for administration and access. 

Input Designs (profiles) 
Both automatic and manual attributing can be made easier and better with preset 
profiles. These can describe document classes that limit the number of possible index 
values, or automatically assign certain criteria. Input designs also include entry masks 
and their logic in manual indexing. 

Categorization (automatic classification or categorizing) 
Based on the information contained in electronic information objects, whether 
OCR-converted faxes, office files or output files, automatic classification programs can 
extract index, category, and transfer data autonomously. These systems can evaluate 
information based on predefined criteria or in a self-learning process. 

The objective of all "Capture" components is the provision of information to the "Manage" 
components for further processing or archiving. 
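
Categorization "based on predefined criteria," as described above, can be pictured with a small rule-based sketch. The document classes, the keyword lists, and the function name `categorize` are hypothetical and chosen only for illustration; real classification products use far richer criteria or self-learning models.

```python
import re
from typing import Dict, List, Optional

# Hypothetical document classes and keywords, invented for this example.
RULES: Dict[str, List[str]] = {
    "invoice": ["invoice", "amount due", "payment terms"],
    "contract": ["agreement", "party", "hereby"],
    "claim": ["claim", "policy number", "damage"],
}


def categorize(text: str, rules: Dict[str, List[str]] = RULES) -> Optional[str]:
    """Return the class whose keywords occur most often, or None if none match."""
    lowered = text.lower()
    scores = {
        category: sum(
            len(re.findall(r"\b" + re.escape(keyword) + r"\b", lowered))
            for keyword in keywords
        )
        for category, keywords in rules.items()
    }
    if not scores:
        return None
    best = max(scores, key=scores.get)  # on a tie, the first class listed wins
    return best if scores[best] > 0 else None


category = categorize("Invoice 4711: amount due within 30 days")
```

A document that matches no keyword is left unclassified (`None`), which in practice would route it to manual indexing.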


Manage 


The Manage components are for the management, processing, and use of information. They 
incorporate: 

• Databases for administration and retrieval, and 
• Access authorization systems. 

The goal of a closed ECM system is to provide these two components just once as services 
for all "Manage" solutions such as Document Management, Collaboration, Web Content 
Management, Records Management and Workflow / Business Process Management. To link the 
various "Manage" components, they should have standardized interfaces and secure transaction 
processes for inter-component communication. 

DM — Document Management 

Document management in this context does not refer to the industry known in Europe as 
DMS, but to document management systems in the narrower "classical" sense. These systems 
control documents from their creation through to long-term archiving. Document management 
includes functions like: 

• Check in/Check out for checking stored information for consistency 
• Version management to keep track of different versions of the same information 
  with their revisions and renditions (same information in a different format) 
• Search and navigation for finding information and its associated contexts 
• Visualizing for showing information in structures like virtual files, folders, and 
  overviews 
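
The check-in/check-out and version management functions listed above can be sketched as a small in-memory store. The class and method names (`DocumentStore`, `check_out`, `check_in`) are illustrative assumptions, not the interface of any particular product; the sketch only shows how an exclusive check-out keeps two users from overwriting each other and how each check-in creates a new version.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Document:
    versions: List[str] = field(default_factory=list)
    checked_out_by: Optional[str] = None


class DocumentStore:
    """Minimal in-memory store with exclusive check-out and numbered versions."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def create(self, doc_id: str, content: str) -> None:
        if doc_id in self._docs:
            raise KeyError(f"{doc_id} already exists")
        self._docs[doc_id] = Document(versions=[content])

    def check_out(self, doc_id: str, user: str) -> str:
        doc = self._docs[doc_id]
        if doc.checked_out_by is not None:
            raise PermissionError(f"{doc_id} is checked out by {doc.checked_out_by}")
        doc.checked_out_by = user
        return doc.versions[-1]  # the user edits the latest version

    def check_in(self, doc_id: str, user: str, content: str) -> int:
        doc = self._docs[doc_id]
        if doc.checked_out_by != user:
            raise PermissionError(f"{doc_id} is not checked out by {user}")
        doc.versions.append(content)  # earlier versions are kept, never replaced
        doc.checked_out_by = None
        return len(doc.versions)  # the new version number, counted from 1

    def get_version(self, doc_id: str, number: int) -> str:
        return self._docs[doc_id].versions[number - 1]
```

A second user who tries to check out a document that is already checked out receives a `PermissionError`, and every earlier version remains retrievable through `get_version`.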

However, the functions of Document Management increasingly overlap with those of the 
other "Manage" components, the ever-expanding functionalities of office applications like 
Outlook/Exchange or Notes/Domino, and the characteristics of "Library Services" for 
administering information storage. 

Collaboration (collaborative systems, groupware) 

Collaboration simply means "working together." However, these solutions, which developed 
from conventional groupware, now go much further and include elements of Knowledge 
Management. Collaboration includes the following functions: 

• Jointly usable information databases 
• Joint, simultaneous, controlled information processing 
• Knowledge based on skills, resources and background data for joint information 
  processing 
• Administration components such as whiteboards for brainstorming, 
  appointment scheduling, project management etc. 
• Communication applications such as video conferencing 
• Integration of information from other applications in the context of joint 
  information processing 

WCM — Web Content Management 

Enterprise Content Management claims to integrate Web Content Management. However, 
information presented on the Internet and Extranet or on a portal should only be data that is 
already present in the company, whose delivery is controlled by access authorization and storage. 
Web Content Management includes the following functions, among others: 

• Creation of new or editing of existing information in a controlled generation and 
  publishing process 
• Delivery and administration of information for the web presentation 
• Automatic conversion for various display formats, personalized display and 
  versions 
• Secure separation of access to public and non-public information 
• Visualization for Internet presentation (browser, HTML, XML etc.) 

RM — Records Management (file and archive management) 

Unlike traditional electronic archival systems, Records Management (RM; Electronic 
Records Management or ERM) refers to the pure administration of records, important 
information and data that companies are required to archive. Records Management is 
independent of storage media, and can also manage information stored outside 
electronic systems. Among the functions of Records Management are: 

• Visualisation of file plans and other structured indexes for the orderly storage of 
  information 
• Unambiguous indexing of information, supported by thesauri or controlled 
  wordlists 
• Management of record retention schedules and deletion schedules 
• Protection of information in accordance with its characteristics, sometimes 
  down to individual content components in documents 
• Use of international, industry-specific or at least company-wide standardized 
  metadata for the unambiguous identification and description of stored 
  information 
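
The management of retention and deletion schedules listed above can be sketched as follows. The record classes and retention periods are invented for the example and are not legal guidance; the helper names (`deletion_date`, `due_for_deletion`) are likewise assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# Hypothetical retention periods in years, keyed by record class.
RETENTION_YEARS: Dict[str, int] = {
    "invoice": 10,
    "correspondence": 6,
    "job_application": 1,
}


@dataclass(frozen=True)
class Record:
    record_id: str
    record_class: str
    closed_on: date


def deletion_date(record: Record, schedule: Dict[str, int] = RETENTION_YEARS) -> date:
    """Earliest date on which the record may be deleted."""
    target_year = record.closed_on.year + schedule[record.record_class]
    try:
        return record.closed_on.replace(year=target_year)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 1 March instead.
        return date(target_year, 3, 1)


def due_for_deletion(
    records: List[Record], today: date, schedule: Dict[str, int] = RETENTION_YEARS
) -> List[str]:
    """IDs of all records whose retention period has expired by `today`."""
    return [r.record_id for r in records if deletion_date(r, schedule) <= today]
```

Keeping the schedule in a plain mapping separates the retention rules from the records themselves, so a change in policy does not require touching stored records.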

Wf — Workflow / BPM — Business Process Management 

Workflow and Business Process Management differ substantially. 

There are different types of Workflow, for example: 

• "Production Workflow" which uses predefined sequences to guide and control 
  processes 
• "Ad-Hoc Workflow" in which the user determines the process sequence on the 
  fly. 

Workflow solutions can be implemented as: 

• "Workflow solutions" with autonomous clients which users mostly work with, 
  or as 
• "Workflow Engines" which act as a background service controlling the 
  information and data flow, without requiring a client of their own for this. 

Workflow Management includes the following functions, among others: 

• Visualisation of process and organization structures 
• Capture, administration, visualization, and delivery of grouped information with 
  its associated documents or data 
• Incorporation of data processing tools (such as specific applications) and 
  documents (such as office products) 
• Parallel and sequential processing of procedures including simultaneous saving 
• Reminders, deadlines, delegation and other administration functionalities 
• Monitoring and documentation of process status, routing, and outcomes 
• Tools for designing and displaying processes 

The objective is to largely automate processes by incorporating all necessary resources. 
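
A "Production Workflow" with a predefined sequence, as described above, can be sketched as a simple step tracker. The step names and the class `WorkflowInstance` are hypothetical; the sketch enforces the fixed order of steps and documents who completed each one, which corresponds to the monitoring and documentation function in the list above.

```python
from typing import List, Optional

# Hypothetical predefined sequence for an insurance claim.
CLAIM_WORKFLOW: List[str] = ["capture", "review", "approve", "archive"]


class WorkflowInstance:
    """One running case that must pass through the steps in the given order."""

    def __init__(self, steps: List[str]) -> None:
        if not steps:
            raise ValueError("a workflow needs at least one step")
        self._steps = list(steps)
        self._position = 0
        self.history: List[str] = []  # documentation of completed steps

    @property
    def current_step(self) -> Optional[str]:
        if self._position < len(self._steps):
            return self._steps[self._position]
        return None  # every step has been completed

    @property
    def finished(self) -> bool:
        return self._position >= len(self._steps)

    def complete(self, step: str, user: str) -> None:
        # A production workflow does not allow steps to be skipped or reordered.
        if step != self.current_step:
            raise ValueError(f"expected step {self.current_step!r}, got {step!r}")
        self.history.append(f"{step} completed by {user}")
        self._position += 1
```

An "Ad-Hoc Workflow" would relax the check in `complete` and let the user choose the next step instead.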

Although the words are often used interchangeably, BPM or Business Process Management 
goes a step further than Workflow. BPM aims at the complete integration of all affected 
applications within an enterprise, with monitoring of processes and assembling of all required 
information. Among BPM's functions are: 

• Complete workflow functionality 
• Process and data monitoring at the server level 
• EAI or Enterprise Application Integration, to link different applications 
• BI or Business Intelligence, with rule structures, integration of information 
  warehouses, and utilities that assist users in their work. 

Today, "Manage" components are offered individually or integrated as suites. In many cases 
they already include the "Store" components. 
Store 


"Store" components are used for the temporary storage of information which it is not 
required or desired to archive. Even if it uses media that are suitable for long-term archiving, 
"Store" is still separate from "Preserve." 

The "Store" components listed by AIIM can be divided into three categories: "Repositories" 
as storage locations, "Library Services" as administration components for repositories, and 
storage "Technologies." These infrastructure components are sometimes held at the operating 
system level like the file system, and also include security technologies which will be discussed 
further below in the "Deliver" section. However, security technologies including access control 
are superordinate components of an ECM solution. 

Repositories 

Different kinds of ECM repositories can be used in combination. Among the possible kinds 
are: 

File Systems 
File systems are used primarily for temporary storage, as input and output caches. The 
goal of ECM is to reduce the data burden on the file system and make the information 
generally available through "Manage," "Store" and "Preserve" technologies. 

Content Management Systems 
This is the actual storage and repository system for content, which can be a database or 
a specialized storage system. 

Databases 
Databases administer access information, but can also be used for the direct storage of 
documents, content, or media assets. 

Data Warehouses 
These are complex storage systems based on databases, which reference or provide 
information from all kinds of sources. They can also be designed with more global 
functions such as Document or Information Warehouses. 

Library Services 

Library Services have to do with libraries only in a metaphorical way. They are the 
administrative components close to the system that handle access to information. The Library 
Service is responsible for taking in and storing information from the Capture and Manage 
components. It also manages the storage locations in dynamic storage, the actual "Store," and in 
the long-term "Preserve" archive. The storage location is determined only by the characteristics 
and classification of the information. The Library Service works in concert with the database of 
the "Manage" components. This serves the necessary functions of 

• Search, and 
• Retrieval 

While the database does not "know" the physical location of a stored object, the Library 
Service manages the 

• Online storage (direct access to data and documents) 
• Nearline storage (data and documents on a medium that the drive can access, 
  but for which robotics or something similar must first be set up) 
• Farline storage (data and documents on a medium that the drive can access, but 
  for which a person needs to load media) 
• Offline storage (data and documents on a medium that is removed from system 
  access). 

If there is not a superordinate document management system to provide the functionality, 
the Library Service must have 

• Version management to control the status of information 
• Check-in/Check-out, for controlled information provision 

An important Library Service function is the generation of logs and journals on information 
usage and edits, called an "audit trail." 
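
The division of labour described above, in which callers know only an object identifier while the Library Service tracks the storage tier and writes an audit trail, can be sketched as follows. The class names, the location strings, and the rule that offline objects cannot be retrieved directly are simplifying assumptions for illustration only.

```python
from enum import Enum
from typing import Dict, List, Tuple


class Tier(Enum):
    ONLINE = "online"
    NEARLINE = "nearline"
    FARLINE = "farline"
    OFFLINE = "offline"


class LibraryService:
    """Maps object IDs to storage tiers and logs every access."""

    def __init__(self) -> None:
        self._locations: Dict[str, Tuple[Tier, str]] = {}
        self._content: Dict[str, bytes] = {}
        self.audit_trail: List[str] = []

    def store(self, object_id: str, data: bytes, location: str) -> None:
        # New objects start in online storage in this sketch.
        self._content[object_id] = data
        self._locations[object_id] = (Tier.ONLINE, location)
        self.audit_trail.append(f"store {object_id} online at {location}")

    def move(self, object_id: str, tier: Tier, location: str) -> None:
        if object_id not in self._locations:
            raise KeyError(object_id)
        self._locations[object_id] = (tier, location)
        self.audit_trail.append(f"move {object_id} to {tier.value} at {location}")

    def retrieve(self, object_id: str, user: str) -> bytes:
        # The caller passes only the ID; the physical location stays internal.
        tier, _location = self._locations[object_id]
        if tier is Tier.OFFLINE:
            self.audit_trail.append(f"denied {object_id} for {user}: offline")
            raise LookupError(f"{object_id} is offline and must be reloaded first")
        self.audit_trail.append(f"retrieve {object_id} by {user} from {tier.value}")
        return self._content[object_id]
```

Because the "Manage" database would hold only the object ID, objects can be moved between tiers without updating any index entries.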

Storage technologies 

A wide variety of technologies can be used to store information, depending on the 
application and system environment: 

Read and Write Magnetic Online Media 
This includes hard drives as RAID (Redundant Array of Independent Disks) server drive 
subsystems, Storage Area Networks (SANs) as storage infrastructures and 
Network-attached storage (NAS) as directly accessible network storage areas. 

Magnetic Tape 
Used in automated storage units such as "Libraries" or "Silos" with robotics for access 
and, like DAT in smaller environments, for backup but not for online access. 

Digital Optical Media 
CD (CD-R for write-once Compact Disk, CD-RW for read-and-write Compact Disk), 
Digital Versatile Disk (DVD), MO (Magneto Optical), and other formats can be 
used for storage and distribution, or in jukeboxes for online storage. 


Preserve 


The "Preserve" components of ECM handle the long-term, safe storage and backup of static, 
unchanging information, as well as temporary storage of information that is not desired or 
required to be archived. This is sometimes called "electronic archiving," but that has substantially 
broader functionality than that of "Preserve." Electronic archiving systems today generally 
consist of a combination of administration software like Records Management, Imaging or 
Document Management, Library Services (IRS - Information Retrieval System) and storage 
subsystems. 

But it is not just electronic media that are suitable for long-term archiving. For purely 
securing information, microfilm is still viable, and is now offered in hybrid systems with 
electronic media and database-supported access. The decisive factor for all long-term storage 
systems is the timely planning and regular performance of migrations, in order to keep 
information available in the changing technical landscape. This ongoing process is called 
Continuous Migration. The "Preserve" components contain special viewers, conversion and 
migration tools, and long term storage media: 

Long term storage media 
WORM optical disk 
Write Once Read Many (WORM) rotating digital optical storage media, which include 
the classic 5¼" or 3½" WORM disc in a protective sleeve, as well as CD-R and DVD-R. 
Recording methods vary for these media, which are held in jukeboxes for online and 
automated nearline access. 

WORM tape 
Magnetic tapes with WORM characteristics are used in special drives and can be as 
secure as a traditional WORM medium if used properly with specially secured tapes. 

WORM hard disk 
Magnetic disk storage with special software protection against overwriting, erasure, 
and editing delivers security similar to that of a traditional WORM medium. An example is 
Content Addressed Storage (CAS). 

Storage networks 
Storage networks such as NAS (Network Attached Storage) and SAN (Storage Area Networks) 
can also be used if they meet the requirements of edit-proof auditing acceptability with 
unchangeable storage, protection against manipulation and erasure, etc. 

Microfilm 
Microforms such as microfilm, aperture cards, and jackets can be used to back up 
information that is no longer in use and does not require machine processing. 

Paper 
Paper still has applications as a long-term storage medium, since it does not require 
migration and can be read without any technical aids. However, like microfilm, it is used 
only as a second safeguard for information that originated electronically. 
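
As a rough illustration of the Content Addressed Storage idea mentioned under "WORM hard disk" 
above, the following Python sketch stores each object under the SHA-256 hash of its content. 
The class name ContentAddressedStore and its in-memory dictionary are illustrative 
assumptions, not the design of any particular CAS product. 

```python
import hashlib


class ContentAddressedStore:
    """Toy write-once store: an object's address is the SHA-256 of its bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        address = hashlib.sha256(data).hexdigest()
        # Existing entries are never replaced; storing the same content twice is a no-op.
        self._objects.setdefault(address, bytes(data))
        return address

    def get(self, address: str) -> bytes:
        data = self._objects[address]
        # Re-hash on read so that corruption or tampering is detected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored object does not match its address")
        return data
```

Because the address is derived from the content, changed content always receives a new 
address, so an existing object cannot be overwritten in place. 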

Long term preservation strategies 
To secure the long-term availability of information, different strategies are used for 
electronic archives. 

Migration 
Continuous migration of applications, index data, metadata and objects from older 
systems to new ones generates a lot of work but secures the accessibility and usability 
of information, and allows information that is no longer relevant to be deleted during 
the process. Conversion technologies are used to update the formats of the stored 
information. 

Emulation 
Emulation of older software makes it possible to run and access the original data 
and objects. Special viewer software can also identify the formats of the preserved 
objects and display them in the new software environment. 

Standards for interfaces, metadata, data structures and object formats are important to 
secure the availability of information. 


Deliver 


The "Deliver" components of ECM are used to present information from the "Manage," 
"Store," and "Preserve" components. They also contain functions used to enter information in 
systems (such as information transfer to media or generation of formatted output files) or for 
readying (for example converting or compressing) information for the "Store" and "Preserve" 
components. Since the AIIM component model is function-based and not to be regarded as an 
architecture, we can assign these and other components here. The functionality in the "Deliver" 
category is also known as "output" and summarized under the term "Output Management." 

The "Deliver" components comprise three groups of functions and media: Transformation 
Technologies, Security Technologies, and Distribution. Transformation and Security as services 
belong at the middleware level and should be available to all ECM components equally. For 
output, two functions are of primary importance: 

Layout/Design 
Tools for laying out and formatting output. 

Publishing 
Applications for presenting information for distribution and publication. 

Transformation technologies 

Transformations should always be controlled and trackable. This is done by background 
services which the end user generally does not see. Among the transformation technologies are: 

COLD / ERM (Computer Output to Laser Disc / Enterprise Report Management) 
As distinct from "Capture" components, it prepares output data for distribution and 
transfer to the archive. Typical applications are lists and formatted output, for example 
individualized customer letters. These technologies also include journals and logs 
generated by the ECM components. Unlike most imaging media, COLD records are 
indexed not in a database table but by absolute positions within the document itself 
(e.g. page 1, line 82, position 12). As a result, COLD index fields cannot be edited after 
submission unless they are converted into a standard database. 

Personalization 
This is no longer just a function of web-based portals, but applies to all ECM 
components. Personalization gives users just those functions and information that they 
need. 

XML (eXtensible Markup Language) 
A description language that allows description of interfaces, structures, metadata, and 
documents. XML is becoming the universal technology for describing information. 

PDF (Portable Document Format) 
An intelligent print and distribution format that enables the platform-independent 
presentation of information. Unlike pure image formats such as TIFF, PDF permits content 
searches, the addition of metadata, and the embedding of electronic signatures. 

XPS (XML Paper Specification) 
An XML specification developed by Microsoft describing the formats and rules for 
distributing, archiving, rendering, and processing XPS documents. 

Converters and Viewers 
Serve to reformat information to generate uniform formats, and also to display and 
output information from different formats. 

Compression 
Used to reduce the storage space needed for pictorial information. The ITU (CCITT) 
process is mainly used for black-and-white TIFF images, and JPEG2000 for color images. 
ZIP applications allow the compression of any kind of data for transfer. 

Syndication 
Used for presenting content in different formats, selections and forms in the context of 
Content Management. Syndication allows the same content to be used multiple times in 
different forms for different purposes. 
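
The positional indexing described for COLD above can be sketched in Python as follows. This 
is a minimal sketch under stated assumptions: the report is plain text, pages are separated 
by a form-feed character, and the names FieldPosition and extract_field are hypothetical 
rather than taken from any COLD product. 

```python
from typing import NamedTuple


class FieldPosition(NamedTuple):
    page: int    # 1-based page number
    line: int    # 1-based line number within the page
    column: int  # 1-based character position within the line
    length: int  # number of characters in the field


def extract_field(report: str, pos: FieldPosition) -> str:
    """Read an index value from its absolute position in a text report."""
    pages = report.split("\f")  # assumption: form feed separates pages
    lines = pages[pos.page - 1].split("\n")
    line = lines[pos.line - 1]
    start = pos.column - 1
    return line[start:start + pos.length].strip()
```

The index value is read out of the document itself rather than from a separate table, which 
is why such fields cannot be edited independently of the stored report. 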

Security Technologies 
Security technologies are cross-sectional functions that are available to all ECM components. 
For example, electronic signatures are used not only when documents are sent, but also in data 
capture via scanning, in order to document the completeness of the capture. PKI (Public Key 
Infrastructure) is a basic technology for electronic signatures. It manages keys and 
certificates, and checks the authenticity of signatures. Electronic signatures demonstrate 
the identity of the sender and the integrity of the sent data, i.e. that it is complete and unchanged. 
In Europe there are three forms of electronic signature, of differing quality and security: simple, 
advanced, and qualified. In most European states the qualified electronic signature is legally 
admissible in legal documents and contracts. Finally, there are Digital Rights Management and 
watermarking, which are used in Content Syndication and in MAM (Media Asset Management) for 
managing and securing intellectual property rights and copyrights. They work with techniques such 
as electronic watermarks that are integrated directly into the file, and seek to protect usage 
rights and content that is published on the Internet. 
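
The integrity property mentioned above (that data is complete and unchanged) can be 
illustrated with a keyed hash. Note that this Python sketch uses an HMAC with a shared secret, 
which is a simpler stand-in and not a PKI-based electronic signature; the function names 
sign and verify are illustrative assumptions. 

```python
import hashlib
import hmac


def sign(message: bytes, key: bytes) -> str:
    """Return a keyed SHA-256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, tag: str) -> bool:
    """Check that the message still matches the tag (constant-time comparison)."""
    return hmac.compare_digest(sign(message, key), tag)
```

Any change to the message or the key produces a different tag, so verification fails for 
altered data. A real signature scheme would additionally bind the tag to a certified key 
pair, which is the part PKI provides. 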

Distribution 
All of the above technologies basically serve to provide the various contents of an ECM system 
to target users by various routes, in a controlled and user-oriented manner. These routes can be 
active, such as e-mail, data media, and memos, or passive, such as publication on websites and 
portals where users can get the information themselves. Possible output and distribution media are: 
• Internet, extranet and intranet 
• E-business portals 
• Employee portals 
• E-mail and fax 
• Data transfer by EDI, XML or other formats 
• Mobile devices like mobile phones, PDAs, and others 
• Data media like CDs and DVDs 
• Digital TV and other multimedia services 
• Paper 
The task of the various "Deliver" components is to provide information to users in the best 
way for the given application, while controlling its use as far as possible. 


Outlook 


Ulrich Kampffmeyer, a former member of the board of directors of AIIM International, 
states in his 2004 white paper on ECM: 

"Document technologies like Enterprise Content Management make traditional data 
processing complete. They bring together structured, weakly structured, and unstructured 
information. Every company, every government agency, and every organization must confront 
the subject. Even if there are no immediate plans to implement such a system, it sneaks into the 
organization of its own accord — with the next server licence update, with the next office 
software suite, with the next database or ERP upgrade. In many companies with heterogeneous 
IT landscapes, the question of which redundant functionalities of existing products are unused is 
already more important than whether to invest in a new software system. The most important job 
is to keep in-house information under control. The questions add up: where to put the thousands 
and thousands of e-mails, what to do with the electronically signed business correspondence, 
where to put taxation-relevant data, how to transfer information from the disorganized file 
system, how to consolidate information in a repository that everybody can use, how to get a 
single login for all the systems, how to create a uniform in-basket for all incoming information, 
how to make sure that no information is lost or ignored, etc. etc. Document technologies play an 
important role in all these questions. ECM solutions are necessary basic components for many 
applications. 

Every potential user will naturally consider his own individual needs before deciding on a 
system. However, putting off decisions does not make them less necessary. Every year 
something supposedly better and easier to use will come along, but waiting will just mean never 
installing anything. Every time the decision is put off, the mountain of uncontrolled and unused 
information gets bigger, and known problems get larger. A sensible long-term migration strategy 
removes the fear of fast technology change. The basic functions of document technology are 
mature, and most products are reliable, stable, secure, and increasingly affordable. In many 
industries, the use of document technology makes the difference in staying competitive. ECM - 
Enterprise Content Management - should be a part of every modern IT infrastructure." 


ECM market development 


Gartner, a leading industry analyst firm, estimated that by mid-2006, 50 percent of ECM 
vendors would have merged or been acquired (0.6 probability); this did not happen. Gartner 
also predicted that by 2008, 75 percent of Global 2000 companies would have a desktop-focused 
and a process-focused content management implementation (0.9 probability), and that ECM would 
continue to absorb other technologies, such as digital asset management and e-mail management. 
It further predicted continued market consolidation, acquisitions, and a separation of 
vendors into platform and solution providers. In practice, ECM remains a highly 
fragmented marketplace despite persistent predictions of consolidation. 

According to Gartner reports as of 2005, the ECM market leaders were Open Text 
Corporation, EMC (Documentum), IBM, FileNet, Stellent and Hummingbird. In 2006 the 
consolidation of the ECM market continued and led to the acquisitions of Hummingbird 
by Open Text, Captiva by EMC, FileNet by IBM and Stellent by Oracle Corporation. New 
competitors in the ECM marketplace are Oracle, Microsoft and Pitney Bowes Group 1 Software. 
Other ECM vendors include Hyland Software, TOWER Software, Objective Corporation, 
Vignette, Interwoven, Xerox, Saperion and a bevy of smaller players targeting low-cost 
document and records management. Ever-Team, SunGard EXP, and Xythos Software were 
added to the Magic Quadrant for ECM 2006. In early 2007, independent analyst firm CMS 
Watch cited substantial turbulence among major ECM vendors, suggesting that some of the 
largest players presented some of the riskiest buys. 

The Web 2.0 wave has brought new players to the market with strength in web-based 
delivery. Koral and EchoSign, both available on the Salesforce.com AppExchange platform, are 
representative of this trend. 

Currently, Enterprise Information Management (EIM) is attracting growing interest from 
organizations that are trying to approach information management (whether structured or 
unstructured) from an enterprise perspective. EIM combines ECM and Business Intelligence. 


Content management system 


Geospatial Content Management System 


A Geospatial Content Management System (GeoCMS) is a Content Management System 
in which objects (users, images, articles, blogs, and so on) can have a latitude and longitude 
position so that they can be displayed on an online interactive map. In addition, the online 
maps link to informational pages (essentially wiki pages) about the data represented. 

A GeoCMS can provide a map of registered users, allowing communities to be built 
geographically by looking at users' locations. Using wiki pages to describe geographical layers 
is a simple way to solve the problem of geographical metadata. 
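
As a minimal sketch of how a content item with a latitude and longitude might be exposed for 
mapping, the Python function below builds a feed item carrying a GeoRSS-Simple point element. 
The function name geo_item and the example URL are hypothetical; none of the CMSs listed in 
this section is claimed to work this way internally. 

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def geo_item(title: str, link: str, lat: float, lon: float) -> ET.Element:
    """Build a feed <item> with a GeoRSS-Simple point for one content object."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    point = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
    point.text = f"{lat} {lon}"  # GeoRSS-Simple order: latitude, then longitude
    return item


if __name__ == "__main__":
    item = geo_item("Office", "http://example.org/wiki/Office", 45.256, -71.92)
    print(ET.tostring(item, encoding="unicode"))
```

A map client that understands GeoRSS can then place the item on the map and link back to its 
informational page. 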

Since the advent of Google Maps and the publication of its API, numerous users have used 
online maps to illustrate their web pages. Google Maps is in itself not a GeoCMS but is a 
powerful building block. 

Similarly Mapserver is also a powerful building block for creating GeoCMS. 

TikiWiki was the first CMS to become a GeoCMS, in early 2003. The approach is now becoming 
popular in various other CMSs, especially since the publication in 2002 of the Mapserver API 
for many scripting languages: PHP, Python, Perl, Java... 


GeoCMS List 


Drupal - see geo modules 

Midgard CMS - see Positioning module 
Plone - see geo plugins 

TikiWiki - see TikiMaps 

WordPress - see GeoPress 


GeoCMS comparison 


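Systems compared: Drupal, Midgard, Plone, TikiWiki, WordPress 

Rows compared: ability to store locations, number of locations per content item, maps on 
content items, and syndication formats. Only some cell values are legible; they are listed 
here in source order: 

Number of locations per content item: Multiple; See specific plugin 
Maps on content items: Google Maps, Yahoo Maps, MapBuilder; See specific [illegible]; Google Maps 
Syndication formats: GeoRSS, KML; GeoRSS; GeoRSS 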


Web content 


Web content is the textual, visual or aural content that is encountered as part of the user 
experience on websites. It may include, among other things: text, images, sounds, videos and 
animations. 

In "Information Architecture for the World Wide Web" (second edition, page 219), Lou 
Rosenfeld and Peter Morville write, "We define content broadly as 'the stuff in your Web site.' 
This may include documents, data, applications, e-services, images, audio and video files, 
personal Web pages, archived e-mail messages, and more. And we include future stuff as well as 
present stuff." 


Beginnings of web content 


While the Internet began with a U.S. Government research project in the late 1950s, the web 
as we know it today did not appear on the Internet until after Tim Berners-Lee and his colleagues 
at the European laboratory (CERN) proposed the concept of linking documents with hypertext. 
But it was not until Mosaic, the forerunner of the famous Netscape Navigator, appeared that 
the Internet became more than a file serving system. 

The use of hypertext, hyperlinks and a page-based model of sharing information, introduced 
with Mosaic and later Netscape, helped to define web content and the formation of websites. 
Today we largely categorize websites by the type of content they contain. 


The page concept 


Web content is dominated by the "page" concept. The web began in an academic setting 
dominated by typewritten pages, and the idea was to link directly from one academic paper 
to another. This was a completely revolutionary idea in the late 1980s and early 1990s, when 
the best link that could be made was to cite a reference in the middle of a typewritten paper 
and name that reference either at the bottom of the page or on the last page of the paper. 

When it became possible for any person to write and own a Mosaic page, the concept of a 
"Home Page" blurred the idea of a page. Anyone could own a "Web page" or a "Home Page", 
which in many cases was a website containing many physical pages in spite of being 
called "a page". People often cited their "Home Page" to provide credentials, links to anything 
they supported, or any other individual content they wanted to publish. (See, for example, 
the home page of Edmund S. Phelps.) 

Even though "the web" may be the resource we commonly use to "get to" particular locations 
online, many different protocols are invoked to access embedded information. When we are 
given an address, such as http://www.youtube.com, we expect to see a range of web pages, but in 
each page we have embedded tools to view "video clips". 


HTML web content 


Even though we may embed various protocols within web pages, the "web page" composed 
of HTML (or some variation) is still the dominant way in which we share content. And 
while there are many web pages with a localized proprietary structure (most usually business 
websites), many millions of websites are structured according to a common core idea. 

Blogs are a type of website that contains mainly web pages authored in HTML (although the 
blogger may be totally unaware that the pages are composed in HTML, because the blogging 
tool hides it). Millions of people use blogs online; a blog is now the new "Home Page", 
that is, a place where a persona can reveal personal information and/or build up a picture 
of who this persona is. Even though a blog may be written for other purposes, such as promoting 
a business, the core of a blog is that it is written by a "person" and that person reveals 
information from her or his perspective. 

Search engine sites are composed mainly of HTML content, but also have a typically structured 
approach to revealing information. A Search Engine Results Page (SERP) displays a heading, 
usually the name of the search engine, and then a list of websites and their addresses. The 
list shows the results of a query, which is usually expressed as keywords. The results page 
lists web pages that are connected in some way with the keywords used in the query. 
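
The relationship between a keyword query and a results list can be sketched as follows. This 
is a deliberately naive Python illustration that scores pages by counting keyword 
occurrences; the function name search and the sample URLs are assumptions, and real search 
engines use far more elaborate ranking. 

```python
def search(pages: dict[str, str], query: str) -> list[tuple[str, int]]:
    """Return (url, score) pairs for pages containing any query keyword."""
    keywords = query.lower().split()
    results = []
    for url, text in pages.items():
        words = text.lower().split()
        score = sum(words.count(k) for k in keywords)
        if score:
            results.append((url, score))
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(results, key=lambda r: (-r[1], r[0]))


if __name__ == "__main__":
    pages = {
        "http://example.org/a": "content management for web content",
        "http://example.org/b": "geospatial maps",
        "http://example.org/c": "web design",
    }
    for url, score in search(pages, "web content"):
        print(url, score)
```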

Discussion boards are sites composed of "textual" content organized by HTML or some 
variation that can be viewed in a web browser. The driving mechanism of a discussion board is 
that users are registered and, once registered, can write posts. Often a discussion board is 
made up of posts asking some type of question, to which other users may provide answers. 

Ecommerce sites are largely composed of textual material with embedded graphics 
showing the item(s) for sale. However, extremely few of these sites are composed 
page by page in some variant of HTML. Generally, web pages are composed as 
they are served from a database to a customer using a web browser. The user nevertheless 
sees a mainly textual document arriving as a web page to be viewed in the browser. Ecommerce 
sites are usually organized by software known as a "shopping cart". 
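
The idea of composing a page as it is served from a database can be sketched in Python with 
the standard sqlite3 module. The table layout, the sample product and the function names 
below are hypothetical and chosen only for illustration. 

```python
import sqlite3
from html import escape


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory catalogue holding one hypothetical product."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL, image_url TEXT)"
    )
    conn.execute("INSERT INTO products VALUES (1, 'Tea & Biscuits', 4.5, '/img/tea.png')")
    return conn


def render_product_page(conn: sqlite3.Connection, product_id: int) -> str:
    """Compose the HTML for one product at request time from its database row."""
    row = conn.execute(
        "SELECT name, price, image_url FROM products WHERE id = ?", (product_id,)
    ).fetchone()
    if row is None:
        return "<html><body><h1>Product not found</h1></body></html>"
    name, price, image_url = row
    return (
        "<html><body>"
        f"<h1>{escape(name)}</h1>"
        f'<img src="{escape(image_url)}" alt="{escape(name)}">'
        f"<p>Price: {price:.2f}</p>"
        "</body></html>"
    )
```

No HTML file exists for the product; the page is assembled only when it is requested, which 
is why the catalogue can change without anyone editing pages by hand. 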


A wider view of web content 


While there are many millions of pages that are predominantly composed of HTML, or some 
variation, in general we view data, applications, e-services, images (graphics), audio and video 
files, personal Web pages, archived e-mail messages, and many more forms of file and data 
systems as belonging to websites and web pages. 

While there are many hundreds of ways of delivering information on a website, there is a 
common body of knowledge, search engine optimization, that should be read as advice on 
how anything other than text should be delivered. Search engines are currently text-based, 
and they are one of the most common ways in which people using a browser locate sites of 
interest. 


Content is king 


A current meme when organizing or building a website is the catchphrase "Content is King" 
(although Andrew Odlyzko, in "Content is Not King", argues otherwise). What is meant by 
"content" here is written text in plain vanilla HTML, or a variant, that produces good clean 
text that can be indexed with ease by a search engine. 

This argument is valid to a large extent, not because people will find the text interesting 
and useful, or a good enough description to buy the product online, but because search engines 
can index text easily and, if the information is close to what a searcher is seeking, can 
deliver the site to that searcher. Textual information is therefore "king" online because it 
helps rather crude search tools to operate, not because it is actually compelling to people 
seeking information. 


Content management 


Because websites are often complex, the term "content management" appeared in the late 
1990s, identifying a method, or in some cases a tool, for organizing all the diverse elements 
contained on a website. Content management often means that within a business there is a range 
of people with distinct content-related roles, such as content author, editor, publisher, and 
administrator. It can also mean that there is a content management system in which each of 
these roles is organized to help operate the system and organize the information for a website. 

Even though a business may organize itself to collect, hold and present information 
online, content must be organized so as to give the reader (browser) an overall "customer 
experience" in which the site is easy to use and easy to navigate, and the website can fulfill 
the role assigned to it by the business: to sell to customers, to market products and 
services, or to inform customers. 


See also 


Groupware 
Web design 
Web development 
Web service 
Web syndication 
Web template 
World Wide Web Consortium (Web standards) 
Web content management system 


Web content management system 


A web content management system is content management system software implemented as a 
web application used for creating and managing HTML content. It is used to manage and control 
a large, dynamic collection of web material (HTML documents and their associated images). A 
CMS facilitates content creation, content control, editing, and many essential web maintenance 
functions. 

Usually the software provides tools where users with little or no knowledge of programming 
languages and markup languages can create and manage content with relative ease of use. Most 
systems use a database to store content, and a presentation layer displays the content to regular 
website visitors based on a set of templates. Administration and content creation is typically 
done through the web application, but some CMS may be modified in other ways. 
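
The separation between stored content and a template-based presentation layer can be 
sketched in Python with the standard string.Template class. The template text, the content 
fields and the function name render are illustrative assumptions; the body is treated as 
trusted HTML and is not escaped. 

```python
from string import Template

PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    '<body><h1>$title</h1><div class="content">$body</div></body></html>'
)


def render(content: dict[str, str], template: Template = PAGE_TEMPLATE) -> str:
    """Apply a site-wide template to one stored content item."""
    return template.substitute(title=content["title"], body=content["body"])


if __name__ == "__main__":
    print(render({"title": "About", "body": "<p>Hello</p>"}))
```

Changing the template changes the look of every page rendered through it, while the stored 
content stays untouched. 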

A Content Management System (CMS) differs from website builders like Microsoft 
FrontPage or Macromedia Dreamweaver. A CMS allows non-technical users to make changes to 
an existing website with little or no training. A Web CMS typically requires an experienced 
coder to set it up and add features, but it is primarily a website maintenance tool for 
non-technical administrators. 


Web content management systems capabilities 


A web content management system is a software system used to manage and control a large, 
dynamic collection of web material (HTML documents and their associated images). A CMS 
facilitates document control, auditing, editing, and timeline management. A Web CMS provides 
the following key features: 

Automated templates 
Create standard visual templates that can be automatically applied to new and existing 
content, creating one central place to change the look across a group of content on a 
site. 

Easily editable content 
Once your content is separate from the visual presentation of your site, it usually 
becomes much easier and quicker to edit and manipulate. Most CMS software includes 
WYSIWYG editing tools allowing non-technical individuals to create and edit content. 

Scalable feature sets 
Most CMSs have plug-ins or modules that can be easily installed to extend an existing 
site's functionality. 

Web standards upgrades 
Active CMS solutions usually receive regular updates that include new feature sets and 
keep the system up to current web standards. 

Workflow management 
Workflow is the process of creating cycles of sequential and parallel tasks that must be 
accomplished in the CMS. For example, a content creator submits a story, but it is not 
published on the website until the copy editor cleans it up and the editor-in-chief 
approves it. 

Document management 
CMS solutions may provide a means of managing the life cycle of a document from 
initial creation, through revisions, publication, and archiving, to document destruction. 
Products of this kind include TeamSite. 
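
The editorial workflow example given under "Workflow management" above can be sketched as 
a small state machine in Python. The state and action names are assumptions made for 
illustration and do not come from any specific CMS. 

```python
# Allowed (state, action) pairs and the state each one leads to.
ALLOWED = {
    ("draft", "submit"): "in_review",
    ("in_review", "copy_edit"): "edited",
    ("edited", "approve"): "published",
    ("in_review", "reject"): "draft",
    ("edited", "reject"): "draft",
}


def advance(state: str, action: str) -> str:
    """Return the next workflow state, or raise if the step is not permitted."""
    try:
        return ALLOWED[(state, action)]
    except KeyError:
        raise ValueError(f"action {action!r} is not allowed in state {state!r}") from None
```

A story can therefore reach "published" only by passing through the copy editor and the 
editor-in-chief in turn; trying to approve a draft directly raises an error. 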


See also 


• Content management systems 
• Comparison of content management systems 


HTML editors 


An HTML editor is a software application for creating web pages. Although the HTML 
markup of a web page can be written with any text editor, specialized HTML editors can offer 
convenience and added functionality. For example, many HTML editors work not only with 
HTML, but also with related technologies such as CSS, XML and JavaScript or ECMAScript. In 
some cases they also manage communication with remote web servers via FTP and WebDAV, 
and version management systems such as CVS or Subversion. The first full-featured text HTML 
editor available for download on the Internet was the CoffeeCup HTML Editor, begun in 1994 by 
Nicholas Longo and Kevin Jurica of CoffeeCup Software. 


Types 


There are various forms of HTML editors: text, object and WYSIWYG (What You See Is 
What You Get) editors. 


Text editors 


Text (source) editors intended for use with HTML usually provide syntax highlighting. 
Templates, toolbars and keyboard shortcuts may quickly insert common HTML elements and 
structures. Wizards, tooltip prompts and auto-completion may help with common tasks. 

Text HTML editors commonly include either built-in functions or integration with external 
tools for such tasks as source and version control, link-checking, code checking and validation, 
code cleanup and formatting, spell-checking, uploading by FTP or WebDAV, and structuring as 
a project. 

Text editors require user understanding of HTML and any other web technologies the 
designer wishes to use like CSS, JavaScript and server-side scripting languages. 


Object editors 


Some editors allow alternate editing of the source text of objects in more visually organized 
modes than simple color highlighting, but in modes not considered WYSIWYG. Some 
WYSIWYG editors include the option of using palette windows that enable editing the text- 
based parameters of selected objects. These palettes allow either editing parameters in fields for 
each individual parameter, or text windows to edit the full group of source text for the selected 
object. They may include widgets to present and select options when editing parameters. Adobe 
GoLive provides an outline editor to expand and collapse HTML objects and properties, edit 
parameters, and view graphics attached to the expanded objects. 


WYSIWYG HTML editors 


WYSIWYG HTML editors provide an editing interface which resembles how the page will 
be displayed in a web browser. Some editors, such as those in the form of browser extensions, 
allow editing within a web browser. Because a WYSIWYG editor does not require any 
HTML knowledge, it is easier for an average computer user to get started with. 

The WYSIWYG view is achieved by embedding a layout engine based upon that used in a 
web browser. The layout engine will have been considerably enhanced by the editor's developers 
to allow for typing, pasting, deleting and moving the content. The goal is that, at all times during 
editing, the rendered result should represent what will be seen later in a typical web browser. 

While WYSIWYG editors make web design faster and easier, many professionals still use 
text editors, despite the fact that most WYSIWYG editors have a mode for editing HTML code by 
hand. The web was not originally designed to be a visual medium, and attempts to give authors 
more layout control, such as CSS, have been poorly supported by major web browsers. Because of 
this, code automatically generated by WYSIWYG editors frequently sacrifices file size and 
compatibility with fringe browsers in order to create a design that looks the same in widely 
used desktop web browsers. This automatically generated code may be edited and corrected by 
hand. For more on this subject, see Difficulties in achieving WYSIWYG below. 


WYSIWYM editors 

What You See Is What You Mean (WYSIWYM) is an alternative paradigm to the 
WYSIWYG editors above. Instead of focusing on the format or presentation of the document, it 
preserves the intended meaning of each element. For example, page headers, sections, 
paragraphs, etc. are labeled as such in the editing program, and displayed appropriately in the 
browser. 


Valid HTML code 


HTML is a structured markup language. There are certain rules on how HTML must be 
written if it is to conform to W3C standards for the World Wide Web. Following these rules 
means that web sites are accessible on all types and makes of computer, to able-bodied people 
and people with disabilities, and also on wireless devices like mobile phones and PDAs, with 
their limited bandwidths and screen sizes. 

Unfortunately, most HTML documents on the web are not valid according to W3C standards. 
According to one study, only about 1 out of 141 is valid. Even syntactically correct 
documents may be inefficient due to unnecessary repetition, or may be based on rules that 
have been deprecated for some years. Current W3C recommendations on the use of CSS with 
HTML were first formalised by W3C in 1996[4] and have been revised and refined since then. 
See CSS, XHTML, W3C's current CSS recommendation and W3C's current HTML 
recommendation. 
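
One of the simplest structural rules, that elements must be properly nested and closed, can 
be checked with a short Python sketch built on the standard html.parser module. This is not 
a substitute for the W3C validators: it is stricter than HTML in that it treats every 
omitted end tag as an error, it checks nothing else, and the names NestingChecker and 
nesting_errors are hypothetical. 

```python
from html.parser import HTMLParser

# Elements that never take an end tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Track open elements on a stack and record mismatched end tags."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def nesting_errors(html: str) -> list[str]:
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + [f"unclosed <{t}>" for t in checker.stack]
```

For example, nesting_errors("<p><b>x</p>") reports the stray end tag and the elements left 
open, while a correctly nested fragment yields an empty list. 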

These guidelines emphasise the separation of content (HTML or XHTML) from style (CSS). 
This has the benefit of delivering the style information once for a whole site, not repeated in each 
page, let alone in each HTML element. WYSIWYG editor designers have been struggling ever 
since with how best to present these concepts to their users without confusing them by exposing 
the underlying reality. Modern WYSIWYG editors all succeed in this to some extent, but none of 
them has succeeded entirely. 

People who use text editors can generally fix such problems immediately, once they become 
aware of them. People find it frustrating when such errors come from WYSIWYG editors. 

However a web page was created or edited, with a WYSIWYG editor or by hand, it can be argued 
that, in order to be successful among the greatest possible number of readers and viewers, 
and to maintain the 'worldwide' value of the Web itself, it should first and foremost consist 
of valid markup and code. Some would argue that it should not be delivered by a designer to 
his or her customer, and not be considered ready for the World Wide Web, until its HTML and 
CSS syntax has been successfully validated using either the free W3C validator services (W3C 
HTML Validator and W3C CSS Validator) or some other trustworthy alternative. 

Others would argue that publishing useful information as soon as possible should be first 
and foremost. 

Whatever software tools are used to design, create and maintain web pages, there is little 
doubt that the quality of the underlying HTML depends on the skill of the person who works 
on the page. Some knowledge of HTML, CSS and scripting languages, as well as familiarity 
with the current W3C recommendations in these areas, will help any designer produce 
better web pages, with or without a WYSIWYG HTML editor. 


Difficulties in achieving WYSIWYG 


A given HTML document will have an inconsistent appearance on various platforms and 
computers for several reasons: 

Different browsers and applications will render the same markup differently. 
The same page may display slightly differently in Internet Explorer and Firefox on a 
high-resolution screen, but it will look very different in the perfectly valid text-only 
Lynx browser. It needs to be rendered differently again on a PDA, an internet-enabled 
television and on a mobile phone. Usability in a speech or braille browser, or via a 
screen-reader working with a conventional browser, will place demands on entirely 
different aspects of the underlying HTML. Printing the page, via different browsers and 
different printers onto various paper sizes, around the world, places other demands. 
With the correct use of modern HTML and CSS there is no longer any need to provide 
‘Printable page' links and then have to maintain two versions of the whole site. Nor is 
there any excuse for pages not fitting the user's preferred paper size and orientation, or 
wasting ink printing solid background colours unnecessarily, or wasting paper 
reproducing navigation panels that will be entirely useless once printed out[7]. 

Browsers and computer graphics systems have a range of user settings. 
Resolution, font size, colour, contrast etc can all be adjusted at the user's discretion, and 
many modern browsers allow even more user control over page appearance[8]. All an 
author can do is suggest an appearance. 

Web browsers, like all computer software, have bugs. 
They may not conform to current standards. It is hopeless to try to design Web pages 
around all of the common browsers' current bugs: each time a new version of each 
browser comes out, a significant proportion of the World Wide Web would need 
re-coding to suit the new bugs and the new fixes. It is generally considered much wiser to 
design to standards, staying away from 'bleeding edge' features until they settle down, 
and then wait for the browser developers to catch up to your pages, rather than the 
other way round[9]. In this regard, no one can argue that CSS/XHTML is still 'cutting 
edge', as there is now widespread support available in common browsers for all the 
major features[10], even if many WYSIWYG and other editors have not yet entirely 
caught up[11]. 

What you see may be what most visitors get, but it is not guaranteed to be what everyone 
gets. 


Comparison of HTML editors 


The following tables compare general and technical information for a number of 
(purportedly) WYSIWYG HTML editors. Please see the individual products' articles for further 
information, and Comparison of text editors for information on text editors, many of which have 
features to assist with writing HTML. This article is not all-inclusive or necessarily up-to-date. 


General information 


Basic general information about the software: creator/company, license/price etc. 


Editor | Version | Creator | Cost (USD) | Software license 
Amaya | 9.54 | W3C, INRIA | Free | W3C 
Aptana | ? | Aptana | Free | EPL 
Blaze Composer | 3.0 | Nikhil Baliga | Free | Closed source 
CoffeeCup HTML Editor | 2007 | CoffeeCup Software | US $49 | Closed source 
Contribute | 4 | Adobe Systems (formerly Macromedia) | US $149 | Closed source 
Dreamweaver | CS3 (9.0) | Adobe Systems (formerly Macromedia) | US $399 | Closed source 
Evrsoft First Page | 2006 | Evrsoft | US $59.95 | Closed source 
FrontPage (Discontinued) | 2003 | Microsoft | US $199 | Closed source 
GoLive | 9.0 | Adobe Systems (formerly GoLive Systems) | US $399 | Closed source 
HomeSite | 5.5 | Adobe Systems (formerly Macromedia) | US $99 / €123 | Adobe Acrobat License 
HTML-Kit | 292 | Chami.com | Free to use, optional registration: US $79 | Closed source 
Microsoft Expression Web Designer | 2006 | Microsoft | US $299 | Closed source 
Mozilla Composer | 1.7.13 | Mozilla Foundation | Free | MPL/GPL/LGPL 
Nvu | 1.0 | Daniel Glazman | Free | MPL/GPL/LGPL 
OPEN BEXI HTML BUILDER | 16 | ? | Free | GPL 
RapidWeaver | 3.5.1 | Realmac Software | $39.95 | Closed source 
SeaMonkey Composer | 1.1.1 | SeaMonkey Council | Free | MPL/GPL/LGPL 
Serif WebPlus | 10 | Serif | US $79.99 | Closed source 
SiteSpinner | 2.7 | Virtual Mechanics | US $49.00 | Closed source 


Operating system support 


Editor | Windows | Mac OS X | Mac OS 9 | Linux | BSD | Unix 
Amaya | Yes | Yes | No | Yes | source only | source only 
Aptana | Yes | Yes | No | Yes | No | No 
Blaze Composer | Yes | No | No | No | No | No 
CoffeeCup HTML Editor | Yes | No | No | No | No | No 
Contribute | Yes | Yes | No | No | No | No 
Dreamweaver | Yes | Yes | Dropped | No | No | No 
Evrsoft First Page | Yes | No | No | No | No | No 
GoLive | Yes | Yes | No | No | No | No 
HTML-Kit | Yes | No | No | No | No | No 
Mozilla Composer | Terminated (1.7.13) | Terminated (1.7.13) | Dropped (1.2.1) | Terminated (1.7.13) | Terminated (1.7.13) | Terminated (1.7.13) 
Microsoft FrontPage | Terminated (2003) | No | No | No | No | No 
Nvu[2] | Yes | Yes | No? | Yes | Yes | No 
SeaMonkey Composer | Yes | Yes | No | Yes | Yes | Yes 
SiteSpinner | Yes | No | No | No | No | No 


Editor features 


Editor | FTP upload | Server-side scripting | Shared editing | Spell checking | Templates 
Amaya | No | No | No | Yes | No 
Aptana | Yes | No | Yes | No | ? (Snippets) 
CoffeeCup HTML Editor | ? | No | No | Yes | Yes 
Dreamweaver | Yes | Yes | Yes | Yes | Yes 
Evrsoft First Page | Yes | Yes | No | Yes | Yes 
GoLive | Yes | Yes | No | Yes | Yes 
HTML-Kit | Yes | Yes | No | Yes | Yes 
Microsoft FrontPage | Yes | Yes | Yes | Yes | Yes 
Nvu | Yes | No | No | Yes | Yes 
SeaMonkey Composer | Yes? | No | No | Yes | No 
SiteSpinner | Yes | No | No | Yes | Yes 

Web technology support 

Editor | CSS2 | Frames | Java | JavaScript | XSLT | XHTML | MathML | XForms | RSS | Atom 
Amaya | ? | No | No | No | No | Yes | Yes | No | No | No 
Aptana | Yes | Yes | Yes | Yes | No | Yes | No | No | No | No 
Blaze Composer | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No 
CoffeeCup HTML Editor | Yes | Yes | No | Yes | No | Yes | No | No | Yes | Yes 
Dreamweaver | Yes | Yes | No | Yes | Yes | Yes | Partial | ? | Yes | ? 
Evrsoft First Page | ? | Yes | ? | Yes | Yes | Yes | No | No | No | No 
FrontPage | ? | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No 
GoLive | Yes | Yes | Yes | Yes | No | Yes | No | No | No | No 
Nvu | Yes | No | No | Yes | No | Yes | No | No | No | No 
SeaMonkey Composer | No | Yes | No | Yes | No | No | No | No | No | No 
SiteSpinner | Yes | Yes | No | Yes | No | No | No | No | No | No 

Image format support 

Editor | JPEG | GIF | PNG | MNG | TIFF | SVG | PDF 
Amaya | Yes | Yes | Yes | ? | Yes | Yes | No 
Aptana | Yes | Yes | Yes | No (coming soon) | Yes | Depends | Depends 
CoffeeCup HTML Editor | Yes | Yes | Yes | No | Yes | No | No 
Dreamweaver | Yes | Yes | Yes | No | No | No | ? 
Evrsoft First Page | Yes | Yes | Yes | No | Yes | No | No 
FrontPage[3] | Yes | Yes | Yes | Depends | Yes? | Depends | Depends 
GoLive | Yes | Yes | Yes | No | No | Yes | No 
Nvu | Yes | Yes | Yes | No | No | No | No 
SeaMonkey Composer | Yes | Yes | Yes | No | No | No | No 
SiteSpinner | Yes | Yes | Yes | No | Yes | Yes | No 

See also 


• List of HTML editors 
• Website builder 


References 


1. A version for Mac OS was released around 1998, but has since been discontinued. 
2. Nvu can be built successfully on any platform with the Netscape Portable Runtime. 
3. The actual number of supported formats depends on the installed converter. FrontPage 
uses Microsoft Office converters. 


List of HTML editors 


The following is a list of HTML editors with articles in Wikipedia. 


Text editors 


Plain text editors may be used to produce webpages. 
The following are some commonly used text editors: 
• Emacs 
• gedit 
• Kate 
• Notepad 
• TextEdit 
• UltraEdit 
• vi 


Text-based HTML editors 


Text-based HTML editors evolved from basic text editors, but include additional tools 
specifically geared toward handling code. 
• Alleycode HTML Editor 
• Aptana 
• Arachnophilia 
• BBEdit 
• Bluefish 
• CoffeeCup HTML Editor 
• EditPlus 
• Evrsoft 1st Page 
• HTML-Kit 
• Macromedia HomeSite 
• NoteTab 
• PSPad 
• Quanta Plus 
• SCREEM 
• Siteaid 
• skEdit 
• Taco HTML Edit 
• TextMate 
• TopStyle 
• Notepad++ 


Word processors 


While word processors are not ostensibly HTML editors, many of the major products are 
capable of exporting document layouts in HTML format. This offers the ease of use of a word 
processor, similar to a WYSIWYG product (see below), but has some of the same end product 
limitations. 

• AbiWord 
• Apple Pages (part of the iWork suite) 
• AppleWorks 
• Microsoft Word 
• OpenOffice.org Writer 
• WordPerfect 


WYSIWYG editors 


WYSIWYG (What You See Is What You Get) code generators offer speed and ease of use. 

Many of these editors do not require any knowledge of the programming languages 
generated by the software. 

Pages are often produced in a proprietary format and then exported as finished code; the 
exported pages cannot usually be imported for further editing, but new pages can be exported 
from the original proprietary file after it has been edited. 

Exported files tend to be larger than hand-coded pages (those produced with a text-based 
HTML editor or a plain text editor). 

WYSIWYG generators tend to be better than word processors at producing highly graphical 
and interactive pages. 

• Adobe Contribute "Dreamweaver Lite" 
• Adobe Dreamweaver - Previously Macromedia Dreamweaver. 
• Adobe GoLive 
• Amaya 
• Blockstar 
• Bluevoda 
• HotDog 
• iWeb 
• KompoZer 
• Media Lab SiteGrinder 
• Microsoft Expression Web 
• Microsoft SharePoint Designer 
• Microsoft Visual Studio / ASP.NET Web Matrix 
• Microsoft Visual Web Developer 
• NetObjects Fusion 
• Nvu 
• Quanta Plus 
• RapidWeaver 
• Sandvox 
• SeaMonkey Composer 
• Softpress Freeway 
• WorldWideWeb 

Editors that have been discontinued, but may be in common use: 
• AOLpress - Now discontinued. 
• Adobe PageMill - Now discontinued. Replaced by Adobe GoLive. 
• Microsoft FrontPage - Now discontinued. Replaced by Microsoft Expression Web Designer. 
• Mozilla Composer - No longer being updated or supported. Replaced by SeaMonkey Composer. 
• Netscape Composer - No longer being updated or supported. Replaced by SeaMonkey Composer. 


See also 


• HTML editors 


Adobe Dreamweaver 


Developer: Adobe Systems (formerly Macromedia) 
Latest release: CS3 (9.0) / March 27, 2007 
OS: Mac OS X, Microsoft Windows 
Genre: HTML editor 
License: Closed source 
Website: www.adobe.com/products/dreamweaver/ 


Adobe Dreamweaver, or simply Dreamweaver, is a web development tool originally 
created by Macromedia (now Adobe Systems). Initial versions of the application served as 
simple WYSIWYG HTML editors but more recent versions have incorporated notable support 
for many other web technologies such as CSS, JavaScript, and various server-side scripting 
frameworks. [1] The software is available for both the Mac and Windows platforms. 
Dreamweaver is currently owned by Adobe Systems, which acquired Macromedia in 2005. 


Features 


As a WYSIWYG editor, Dreamweaver can hide the details of pages’ HTML code from the 
user, making it possible for non-coders to create web pages and sites. A professional criticism of 
this approach is that it produces HTML pages whose file size and amount of HTML code are 
much larger than necessary, which can cause web browsers to perform poorly. This can be 
particularly true because the application makes it very easy to create table-based layouts. In 
addition, some web site developers have criticized Dreamweaver in the past for producing code 
that often does not comply with W3C standards though this has improved considerably in recent 
versions. Dreamweaver 8.0 (the version prior to the recently released 9.0 within CS3) performed 
poorly on the Acid2 Test, developed by the Web Standards Project. However, Macromedia has 
increased the support for CSS and other ways to lay out a page without tables in later versions of 
the application, with the ability to convert tables to layers and vice versa. 

Dreamweaver allows users to preview websites in many browsers, provided that they are 
installed on their computer. It also has some site management tools, such as the ability to find 
and replace lines of text or code across the entire site by whatever parameters are specified, and a 
templatization feature for creating multiple pages with similar structures. The behaviors panel 
also enables use of basic JavaScript without any coding knowledge. 

With the advent of version MX, Macromedia incorporated dynamic content creation tools 
into Dreamweaver. In the spirit of HTML WYSIWYG tools, it allows users to connect to 
databases (such as MySQL and Microsoft Access) to filter and display content using scripting 
technologies such as Active Server Pages (ASP), ASP.NET, ColdFusion, JavaServer Pages 
(JSP), PHP, and more without any previous programming experience. 

Dreamweaver 8.0 also included support for WYSIWYG XSLT editing, with CS3 extending 
on this functionality to introduce support for converting static HTML to XSLT and performing 
fragmented XSL transformations on the server-side. 

Dreamweaver can use "Extensions" - small programs, which any web developer can write 
(usually in HTML and JavaScript). Extensions provide added functionality to the software for 
whoever wants to download and install them. Dreamweaver is supported by a large community 
of extension developers who make extensions available (both commercial and free) for most web 
development tasks from simple rollover effects to full-featured shopping carts. 


Syntax highlighting 


[Screenshot: Dreamweaver MX 2004 running on Mac OS X v10.4 Tiger] 

As of version 8, Dreamweaver supports syntax highlighting for the following languages out 
of the box: 

• ActionScript 
• Active Server Pages (ASP) 
• ASP.NET 
• C# 
• Cascading Style Sheets (CSS) 
• ColdFusion 
• EDML 
• Extensible HyperText Markup Language (XHTML) 
• Extensible Markup Language (XML) 
• Extensible Stylesheet Language Transformations (XSLT) 
• Java 
• JavaScript 
• JavaServer Pages (JSP) 
• PHP: Hypertext Preprocessor (PHP) 
• Visual Basic (VB) 
• Visual Basic Script Edition (VBScript) 
• Wireless Markup Language (WML) 

It is also possible to add your own language syntax highlighting to its repertoire. 


Versions 


• Dreamweaver 1.0 (Released December 1997; Dreamweaver 1.2 followed in March 1998) 
• Dreamweaver 2.0 (Released December 1998) 
• Dreamweaver 3.0 (Released December 1999) 
• Dreamweaver UltraDev 1.0 (Released June 2000) 
• Dreamweaver 4.0 (Released December 2000) 
• Dreamweaver UltraDev 4.0 (Released December 2000) 
• Dreamweaver MX [Internal version number: 6.0] (Released May 2002) 
• Dreamweaver MX 2004 (Released September 10, 2003) 
• Dreamweaver 8 (Released September 13, 2005) 
• Dreamweaver CS3 (Released April 16, 2007) 


See also 


• List of HTML editors 
• Web application 
• Web design 


EditPlus 


Developer: ES-Computing 
Latest release: 2.31 / 6 April 2007 
OS: Microsoft Windows 
License: Shareware 
Website: EditPlus 


EditPlus is a 32-bit text editor for the Microsoft Windows operating system, developed by 
Sangil Kim of ES-Computing. The editor contains tools for programmers, including syntax 
highlighting (and support for custom syntax files), file type conversions, line ending conversion 
(between Linux, Windows and Mac styles), regular expressions for search-and-replace, 
keystroke recording, spell check, full support for Unicode editing, customizable keyboard 
shortcuts, auto-completion and more. Files can be browsed and edited in tabs, and an internal file 
browser is implemented in the software. 
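
The line ending conversion mentioned above comes down to rewriting the end-of-line marker: 
a line feed on Linux, a carriage return plus line feed on Windows, and a lone carriage return 
on classic Mac OS. The Python sketch below illustrates the idea only; it is not EditPlus's 
implementation, and the function name and the style keys are assumptions made for this 
example. 

```python
def convert_line_endings(text, style):
    """Rewrite every line ending in `text` to the requested style.

    `style` is one of "linux" (LF), "windows" (CRLF) or "mac" (CR, as used
    by classic Mac OS). These keys are hypothetical names for this sketch.
    """
    endings = {"linux": "\n", "windows": "\r\n", "mac": "\r"}
    target = endings[style]
    # Normalise everything to LF first so mixed input is handled uniformly;
    # CRLF must be replaced before lone CR to avoid doubling line breaks.
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    return normalized.replace("\n", target)
```

Normalising to a single form before converting keeps files with mixed endings from ending 
up with doubled or missing line breaks. 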

The first version of EditPlus was released on 20 March 1998; as of July 2007, the latest 
version of EditPlus is v2.31. 

A typical installation takes about 3 MB of disk space. EditPlus is released with a shareware 
license, and the current price is 30 USD. 


Reviews 


EditPlus has been given high ranks and has won Best Software awards on several web sites 
related to shareware. In a review, Steve Jones has described the advantages of EditPlus, and the 
areas in which it should be enhanced. 


Microsoft FrontPage 


Developer: Microsoft 
Latest release: 2003 / October 2003 
OS: Microsoft Windows, Mac OS 
Status: Discontinued 
Genre: Web authoring tool 
License: Proprietary EULA 
Website: The Next Generation - Microsoft Office Online 


Microsoft FrontPage (later full name Microsoft Office FrontPage) was a WYSIWYG 
HTML editor and web site administration tool from Microsoft for the Microsoft Windows line 
of operating systems. It was part of the Microsoft Office application suite from 1997 to 2003. A 
Macintosh version was also released in 1998. Microsoft FrontPage has since been replaced by 
Microsoft Expression Web and SharePoint Designer, which were released in December 2006. 

One of the notable features of FrontPage is its built-in support for automated web templates. 
The main distinction between these templates and HTML templates generated by other products 
is that FrontPage templates include an automatic navigation system that creates animated buttons 
for pages that have been added by the user. It also creates a multi-level navigation system on the 
fly using the buttons and the structure of the web site. 


History 


FrontPage was initially created by the Cambridge, Massachusetts company Vermeer 
Technologies Incorporated, evidence of which can be easily spotted in filenames and directories 
prefixed _vti_ in web sites created using FrontPage. Vermeer was acquired by Microsoft in 1996 
specifically so that Microsoft could add FrontPage to its product line-up. 

As a WYSIWYG editor, FrontPage is designed to hide the details of pages' HTML code from 
the user, making it possible for novices to easily create web pages and sites. 

FrontPage's initial outing under the Microsoft name came in 1996 with the release of 
Windows NT 4.0 Server and its constituent HTTPd server Internet Information Services 2.0. 
Bundled on CD with the NT 4.0 Server release, FrontPage 1.1 would run under NT 4.0 (Server 
or Workstation) or Windows 95, and was aimed at providing server administrators with a tool to 
deliver rich web and intranet content in a package as easy to use as Microsoft Word. 

FrontPage used to require a set of server-side plugins originally known as IIS Extensions. 
The extension set was significantly enhanced for Microsoft's inclusion of FrontPage in the 
Microsoft Office line-up with the 97 release, and was subsequently renamed FrontPage Server 
Extensions (FPSE). Both sets of extensions needed to be installed on the target web server for its 
content and publishing features to work. Microsoft offered both Windows and Unix-based 
versions of FPSE. However, newer versions of FrontPage also support the standard WebDAV 
protocol for remote web publishing and authoring. 

A version for Mac OS was released in 1998; however, it had fewer features than the 
Windows product and Microsoft has not updated it since. 

In 2006, Microsoft announced that FrontPage would eventually be superseded by two 
products. Microsoft SharePoint Designer will allow business professionals to design SharePoint- 
based applications. Microsoft Expression Web is targeted for web design professionals who 
create full-blown web sites. Both are partially based on FrontPage. Microsoft announced that 
they will be discontinuing Microsoft FrontPage by December 2006. 


Features 


Some features that are part of the last version of FrontPage include: 

• Help navigating through your site, and seeing your file structure, visually 
• Built-in features for HTML, CSS, and Java / JavaScript (partial) 
• Built-in image editor (MS Image Composer) 
• Point-and-click functionality for common tools, like mouseovers, e-mail forms, 
and hit counts 
• Simple to use with previous knowledge of Office products 
• Integrated data display with Office products like Access and Excel 
• Support for CSS-based themes (like ASP.NET master pages) 
• When you change the URL of a page, all the links to that page are dynamically 
changed 
• Task-assignment for team projects 
• Content is editable from anywhere with FrontPage (password is needed) 
• Support for rich clipboard data import (i.e. copy/pasting data from Internet 
Explorer into FrontPage 2003 will automatically download media resources 
such as images and save them locally) 


Criticism 


Some criticisms of FrontPage include: 
• In previous versions, the WYSIWYG mode tended to generate non-validating 
HTML, resulting in pages that were optimized for Internet Explorer. However, 
FrontPage 2003 is capable of generating valid XHTML if the author requires it. 
• In some cases, HTML that is manually changed in FrontPage's code view can 
revert to incorrectly generated markup after making even slight changes in 
WYSIWYG mode. 
• Because of the way it manages content as live resources, FrontPage is generally 
not well suited for administering medium to large corporate websites. 


The final version of FrontPage is Microsoft Office FrontPage 2003. The company has 
introduced two new products to replace FrontPage, called Microsoft Expression Web and 
Microsoft Office SharePoint Designer. Previous versions include: 

• Vermeer FrontPage 1.0 
• 1995 Microsoft FrontPage 1.1 
• 1997 Microsoft FrontPage 97 (version 2) 
• 1997 Microsoft FrontPage Express 2.0 (free stripped-down version that came with 
Internet Explorer, and could be found online from numerous "download" 
repositories) 
• 1998 Microsoft FrontPage for Macintosh 1.0 
• 1998 Microsoft FrontPage 98 (version 3) 
• 1999 Microsoft FrontPage 2000 (version 9) included in some Office 2000 
editions 
• 2001 Microsoft FrontPage 2002 (version 10) 
• Microsoft Office FrontPage 2003 (version 11) 

Notice: There is no official version 4 to 8, because after FrontPage was included in 
some Office editions, the FrontPage version numbers followed their Office version 
numbers. Nonetheless, version numbers may appear in the meta tags of HTML code 
generated by these versions of FrontPage. 


See also 


• List of HTML editors 


* As add-on or plug-in to the main package. 
** XAMPP bundles slightly different packages for different operating systems; this list 
refers to XAMPP Linux. 
*** This package was built with LightTPD, instead of Apache HTTPd. 
**** WOS provides lots of other applications as packages to download, such as TYPO3, 
WordPress, Joomla, Moodle and many more. 

Please note that some of these WAMP packages have plugins enabling you to use some 
of the technologies not prepacked with the major release. Check the websites for more 
information. 


Web security exploits 


Browser exploit | Cross-site cooking | Cross-site request forgery | Cross-site scripting | 
Cross-zone scripting | Directory traversal | Evil twin (wireless networks) | HTTP response splitting | 
IDN homograph attack | Referer spoofing | Session fixation | Session poisoning | Website 
spoofing 


Browser exploit 


A browser exploit is a short piece of code that exploits a software bug in a web browser 
such that the code makes the browser do something unexpected, such as crashing, reading or 
writing local files, propagating a virus or installing spyware. Malicious code may exploit HTML, 
JavaScript, images, ActiveX, Java and other internet technologies. HTML alone is relatively 
harmless (at worst it can crash a vulnerable web browser); however, in conjunction with malicious 
ActiveX or Java code, it can potentially freeze or crash a browser, or even crash the computer 
running that browser. 

The term "browser exploit" can also refer to the actual bug in the browser code. 


Browser exploits families 


Cross-zone scripting exploits vulnerabilities related to the "zone" concept in some browsers; 
i.e. a page in the "Internet" zone is able to initiate execution with "Local Computer", "Local 
Intranet" or "Trusted Sites" zone privileges. 


Cross-site cooking 


In cross-site cooking, the attacker exploits a browser bug to send an invalid cookie to a 
server. 

Cross-site cooking is a type of browser exploit which allows an attacking site to set a cookie 
in the browser for the cookie domain of another site's server. 

Cross-site cooking can be used to perform session fixation attacks, as a malicious site can 
fixate the session identifier cookie of another site. 
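
The check that cross-site cooking bypasses can be pictured as a domain-matching rule applied 
before a cookie is stored. The Python sketch below is a deliberately simplified model and not 
the logic of any particular browser: the PUBLIC_SUFFIXES set is a tiny hypothetical stand-in 
for a real list of registry-controlled domains, and cookie_domain_allowed is a name invented 
for this example. 

```python
# Hypothetical, tiny stand-in for a real list of registry-controlled suffixes.
PUBLIC_SUFFIXES = {"com", "org", "net", "co.uk"}


def cookie_domain_allowed(request_host, cookie_domain):
    """Decide whether `request_host` may set a cookie for `cookie_domain`."""
    host = request_host.lower().rstrip(".")
    # Strip the leading dot of ".example.com" and any trailing dot used to
    # disguise an overly broad domain such as ".com.".
    domain = cookie_domain.lower().strip(".")
    if not domain or domain in PUBLIC_SUFFIXES:
        return False  # would be shared with unrelated sites
    # The host must be the domain itself or one of its subdomains.
    return host == domain or host.endswith("." + domain)
```

A host may set a cookie for its own domain or a parent domain it belongs to, but not for a 
whole suffix such as ".com" or for an unrelated site; a browser that gets this rule wrong lets 
one site plant cookies that another site will receive. 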

Other attack scenarios may also be possible. For example, an attacker may know of a security 
vulnerability in a server which is exploitable using a cookie. If this vulnerability 
requires, e.g., an administrator password which the attacker does not know, cross-site cooking 
could be used to fool innocent users into unintentionally performing the attack. 

Cross site. Cross-site cooking is similar in concept to cross-site scripting, cross-site request 
forgery, cross-site tracing, cross-zone scripting etc., in that it involves the ability to move 
data or code between different web sites (or in some cases, between e-mail / instant messages 
and sites). These problems are linked to the fact that a web browser is a shared platform for 
different information / applications / sites. Only the logical security boundaries maintained by 
browsers ensure that one site cannot corrupt or steal data from another. However, a browser 
exploit such as cross-site cooking can be used to move things across the logical security 
boundaries. 


Origins 


The name and concept of cross-site cooking were introduced by Michal Zalewski in 2006. The 
name is a mix of cookie and cross-site, attempting to describe the nature of cookies being set 
across sites. 

In his 2006 article, Michal Zalewski credited Benjamin Franz, who in May 1998 reported a 
cookie domain related vulnerability to vendors, with the discovery. Benjamin Franz published 
the vulnerability and discussed it mainly as a way to circumvent "privacy protection" 
mechanisms in popular browsers. Michal Zalewski concluded that the bug, 8 years later, was still 
present (unresolved) in some browsers and could be exploited for cross-site cooking. Various 
remarks such as "vendors [...] certainly are not in a hurry to fix this" were made by Zalewski and 
others. 


Cross-site request forgery 


Cross-site request forgery, also known as one click attack or session riding and 
abbreviated as CSRF (Sea-Surf) or XSRF, is a kind of malicious exploit of websites. Although 
this type of attack has similarities to cross-site scripting (XSS), cross-site scripting requires the 
attacker to inject unauthorized code into a website, while cross-site request forgery merely 
transmits unauthorized commands from a user the website trusts. 


Example and characteristics 


The attack works by including a link or script in a page that accesses a site to which the user 
is known to have authenticated. For example, one user, Bob, might be browsing a chat forum 
where another user, Alice, has posted a message with an image that links to Bob's bank. Suppose 
that, as the URL for the image tag, Alice has crafted a URL that submits a withdrawal form on 
Bob's bank's website. If Bob's bank keeps his authentication information in a cookie, and if the 
cookie hasn't expired, then Bob's browser's attempt to load the image will submit the withdrawal 
form with his cookie, thus authorizing a transaction without Bob's approval. 

A cross-site request forgery is a confused deputy attack against a Web browser. The deputy 
in the bank example is Bob's Web browser which is confused into misusing Bob's authority at 
Alice's direction. 

The following characteristics are common to CSRF: 

• Involve sites that rely on a user's identity 
• Exploit the site's trust in that identity 
• Trick the user's browser into sending HTTP requests to a target site 
• Involve HTTP requests that have side effects 

At risk are web applications that perform actions based on input from trusted and 
authenticated users without requiring the user to authorize the specific action. A user that is 
authenticated by a cookie saved in his web browser could unknowingly send an HTTP request to 
a site that trusts him and thereby cause an unwanted action. 

CSRF attacks using images are often made from Internet forums, where users are allowed to 
post images but not JavaScript. 

Examples of such attacks have been reported against Digg, Amazon.com and Google's 
AdSense. 


Prevention 


For the web site, switching from a persistent authentication method (e.g. a cookie or HTTP 
authentication) to a transient authentication method (e.g. a hidden field provided on every form) 
will help prevent these attacks. A similar approach is to include a secret, user-specific token in 
forms that is verified in addition to the cookie. 
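
A minimal Python sketch of the secret, user-specific token approach follows. It assumes the 
server already has a session identifier for each user and derives the token from it with an 
HMAC; SERVER_SECRET, issue_token and is_valid_token are hypothetical names for this 
example, and a real application would also need to load the secret from configuration and 
deliver the token in a hidden form field. 

```python
import hashlib
import hmac
import secrets

# Hypothetical per-deployment secret; a real site would load this from
# configuration so that it survives restarts.
SERVER_SECRET = secrets.token_bytes(32)


def issue_token(session_id):
    """Derive the token to embed in a hidden form field for this session."""
    return hmac.new(SERVER_SECRET, session_id.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def is_valid_token(session_id, submitted_token):
    """Check the token that came back with the form, in constant time."""
    expected = issue_token(session_id)
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted_token.encode("utf-8"))
```

Because a page on another site cannot read the form that carries the token, a forged request 
arrives with the cookie but without a matching token, and the server can reject it. 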

An alternate method is to "double submit" cookies. This method only works with Ajax based 
requests, but it can be applied as a global fix without needing to alter a large number of forms. If 
an authentication cookie is read using JavaScript before the post is made, the stricter (and more 
correct) cross-domain rules will be applied. If the server requires requests to contain the value of 
the authentication cookie in the body of POST requests or the URL of GET requests, then the 
request must have come from a trusted domain, since other domains are unable to read cookies 
from the trusting domain. On the other hand, this method forces users to enable JavaScript, 
negating the only way a user has to prevent most Cross-site scripting vulnerabilities from being 
exploited. 
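
On the server side, the "double submit" check described above amounts to comparing the 
cookie value with the copy sent in the request body. The Python sketch below shows only that 
comparison, under the assumption that the cookie is named "session" and that the body is 
URL-encoded form data; the function name and parameter names are hypothetical. 

```python
import hmac
from urllib.parse import parse_qs


def double_submit_ok(cookie_header, post_body, cookie_name="session"):
    """Return True if the body repeats the value of the named cookie."""
    # Parse a Cookie header of the form "name=value; other=value".
    cookies = {}
    for pair in cookie_header.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:
            cookies[name] = value
    cookie_value = cookies.get(cookie_name, "")
    # Take the first submitted field with the same name, if any.
    form_value = parse_qs(post_body).get(cookie_name, [""])[0]
    if not cookie_value:
        return False  # no cookie at all: nothing to compare against
    return hmac.compare_digest(cookie_value.encode("utf-8"),
                               form_value.encode("utf-8"))
```

A page on another domain can make the browser send the cookie, but it cannot read the 
cookie's value, so it cannot supply the matching copy in the body. 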

Contrary to popular belief, using POST instead of GET does not offer sufficient protection. 
JavaScript can be used to forge POST requests with ease. Nonetheless, requests that cause side 
effects should always use POST. Also, POST does not leave a trail of variable data in 
webserver and proxy server logs, while GET does leave such a trail. It is still clearly best to use 
POST as part of a defense-in-depth approach. 

Another approach is to check the HTTP Referer header to see if the request is coming from 
an authorized page. Unfortunately, this technique may not work reliably, since browsers 
sometimes omit the Referer header because of a user's privacy settings or because the Referer is 
a secure https page. 
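
A Referer check of this kind can be sketched in a few lines of Python. The function name and 
the allowed_hosts parameter are assumptions for this example, and the choice to reject 
requests with no Referer header is a policy decision: as noted above, it may lock out 
legitimate users whose browsers omit the header. 

```python
from urllib.parse import urlsplit


def referer_allowed(referer, allowed_hosts):
    """Return True if the Referer header names one of the allowed hosts."""
    if not referer:
        # Header omitted (privacy settings, https pages): rejected here,
        # which is the strict choice and may block legitimate users.
        return False
    host = urlsplit(referer).hostname  # lower-cased by urlsplit, or None
    return host is not None and host in allowed_hosts
```

Comparing the parsed host name, rather than searching the header for a substring, avoids 
accepting look-alike hosts that merely contain the trusted name. 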

Although cross-site request forgery defenses typically require modifying the web application, 
individual users can help protect their accounts at poorly designed sites by logging off the site 
before visiting another, or clearing their browser's cookies at the end of each browser session. 


Effects 


This attack relies on a few assumptions: 
• The attacker has knowledge of sites on which the victim is currently authenticated 
(more common on web forums, where this attack is most common) 
• The attacker's "target site" has persistent authentication cookies, or the victim 
has a current session cookie 
• The "target site" doesn't have secondary authentication for actions 

While having potential for harm, the effect is mitigated by the attacker's need to "know his 
audience", such that he attacks a small familiar community of victims, or a more common "target 
site" has poorly implemented authentication systems (for instance, if a common book reseller 
offers 'instant' purchases without re-authentication). 


Cross-zone scripting 


Cross-zone scripting is a browser exploit taking advantage of a vulnerability within a 
zone-based security solution. The attack allows content (scripts) in unprivileged zones to be 
executed with the permissions of a privileged zone, i.e. a privilege escalation within the client 
(web browser) executing the script. The vulnerability could be: 

• a web browser bug which under some conditions allows content (scripts) in 
one zone to be executed with the permissions of a higher privileged zone. 

• a web browser configuration error: unsafe sites listed in privileged zones. 

• a cross-site scripting vulnerability within a privileged zone. 

A common attack scenario involves two steps. The first step is to use a cross-zone scripting 
vulnerability to get scripts executed within a privileged zone. The second step, which completes 
the attack, is to perform malicious actions on the computer using insecure ActiveX components. 

This type of vulnerability has been exploited to silently install various malware (such as 
spyware, remote control software, worms and such) onto computers browsing a malicious web 
page. 


Origins of the zone concept 


Internet Explorer introduced the concept of security zones. However, this is 
a generic issue which is not specific to Internet Explorer; some other browsers also implicitly 
implement the Local Computer zone. 

There are four well known zones in Internet Explorer: 

• Internet. The default zone. Everything which does not belong to other zones. 

• Local intranet. 

• Trusted sites. Usually used to list trusted sites which are allowed to execute 
with extremely permissive security settings (e.g. run unsafe and unsigned ActiveX objects). 

• Restricted sites. 

These zones are explained in detail by Q174360: How to use security zones in Internet 
Explorer. 

There is also an additional hidden zone: 

• Local Computer zone (or My Computer zone). This zone is particularly 
interesting because it can access files on the local computer. Historically this 
zone has been extremely insecure, but in recent versions of Internet Explorer (for 
Windows XP) steps have been taken to reduce the risks associated with this zone. 

Local intranet, Trusted sites and Local Computer are usually configured to be privileged 
zones. Most cross-zone scripting attacks are designed to jump from the Internet zone to a 
privileged zone. 


Cross-zone scripting examples 


Cross-zone scripting into Local Computer Zone 


This type of exploit attempts to execute code in the security context of the Local Computer 
Zone. 

The following HTML illustrates a naive (non-working) exploitation attempt: 


<HTML> 
<IMG SRC="attack.gif"> 
<SCRIPT SRC="file://C:\Documents and Settings\Administrator\Local Settings\Temporary Internet Files\attack.gif"></SCRIPT> 
</HTML> 


Explanation: the HTML code attempts to get attack.gif loaded into the cache by using an 
IMG SRC reference. A SCRIPT SRC tag is then used to attempt to execute the script from 
the Local Computer Zone by addressing the local file in the cache. 


Cross-zone scripting into Local Intranet Zone 


Consider this scenario: 

• An attacker (somehow) knows of a cross-site scripting vulnerability in 
http://intranet.example.com/xss.php 
• A lot of http://intranet.example.com users regularly visit 
http://www.example.com/, where anyone can add Cool links. 
• The attacker adds a Cool link to: 


http://intranet.example.com/xss.php?<script>alert()</script> 


A computer which considers intranet.example.com a part of the Local Intranet zone will now 
successfully be cross-zone scripted. 


Cross-zone scripting into Trusted Sites Zone 


A well known example is the %2f bug in Internet Explorer. It was discovered that the 
following URL 


http://windowsupdate.microsoft.com%2f.example.com/ 


executed with "Trusted Sites" permission if windowsupdate.microsoft.com was listed as a 
trusted site. 


Directory traversal 


A directory traversal attack exploits insufficient security validation or sanitization of 
user-supplied input file names, so that characters representing "traverse to parent directory" 
are passed through to the file APIs. 

The goal of this attack is to order an application to access a computer file that is not intended 
to be accessible. This attack exploits a lack of security (the software is acting exactly as it is 
supposed to) as opposed to exploiting a bug in the code. 

Directory traversal is also known as the ../ (dot dot slash) attack, directory climbing, and 
backtracking. Some forms of this attack are also canonicalization attacks. 


Example 
A typical example of vulnerable application code is: 


<?php 
$template = 'blue.php'; 
// Vulnerable: the cookie value is used as a file name without any validation. 
if (isset($_COOKIE['TEMPLATE'])) 
    $template = $_COOKIE['TEMPLATE']; 
include("/home/users/phpguru/templates/" . $template); 
?> 


An attack against this system could be to send the following HTTP request: 


GET /vulnerable.php HTTP/1.0 
Cookie: TEMPLATE=../../../../../../../../../etc/passwd 


Generating a server response such as: 


HTTP/1.0 200 OK 
Content-Type: text/html 
Server: Apache 

root:fi3sED95ibqR6:0:1:System Operator:/:/bin/ksh 
daemon:*:1:1::/tmp: 
phpguru:f8fk3j1Olf31.:182:100:Developer:/home/users/phpguru/:/bin/csh 


The repeated ../ characters after /home/users/phpguru/templates/ have caused include() to 
traverse to the root directory, and then include the UNIX password file /etc/passwd. 

UNIX /etc/passwd is a common file used to demonstrate directory traversal, as it is often 
used by crackers to try cracking the passwords. 


Variations of directory traversal 


Directory traversal is trickier to prevent than it might seem. A "filter out known bad 
characters" protection strategy is likely to fail. 

There are many other factors involved that would determine whether a directory traversal 
would actually work. However, if the application does not validate the legitimacy of such 
parameters, it is quite likely that attackers may have some wiggle room to exploit this 
functionality for malicious purposes. 

Listed below are some known directory traversal attack strings: 


Directory traversal on UNIX 


Common Unix-like directory traversal uses the ../ characters. 


Directory traversal on Microsoft Windows 


Microsoft Windows or DOS directory traversal uses the ..\ characters. 

Today, many Windows programs or APIs also accept UNIX-like directory traversal 
characters. 

Each partition has a separate root directory (labeled C:\ for a particular partition C) and there 
is no common root directory above that. This means that for most directory vulnerabilities on 
Windows, the attack is limited to a single partition. 


URI encoded directory traversal 


This is a canonicalization problem. 

Some web applications scan the query string for dangerous characters such as: 

• .. 
• ..\ 
• ../ 

to prevent directory traversal. However, the query string is usually URI decoded before use. 
Therefore these applications are vulnerable to percent-encoded directory traversal such as: 

• %2e%2e%2f which translates to ../ 
• %2e%2e/ which translates to ../ 
• ..%2f which translates to ../ 
• %2e%2e%5c which translates to ..\ 

etc. 
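
The ordering problem described above can be sketched in a few lines of Python. This is an 
illustrative sketch only: the function names are hypothetical, and the second function is just 
one possible way to decode before checking, not a complete defense. 

```python
from urllib.parse import unquote


def naive_filter_allows(raw_query: str) -> bool:
    """Flawed check: scans the raw (still encoded) string for traversal sequences."""
    return "../" not in raw_query and "..\\" not in raw_query


def decoded_filter_allows(raw_query: str) -> bool:
    """Decode first, repeating until stable to catch double encoding, then check."""
    decoded = raw_query
    while True:
        once_more = unquote(decoded)
        if once_more == decoded:
            break
        decoded = once_more
    return ".." not in decoded


if __name__ == "__main__":
    for attempt in ("%2e%2e%2fetc/passwd", "..%2fetc/passwd", "blue.php"):
        print(attempt, naive_filter_allows(attempt), decoded_filter_allows(attempt))
```

The naive filter lets the encoded strings through because it inspects the input before the 
decoding step that the application performs later. 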


Unicode / UTF-8 encoded directory traversal 


This is a canonicalization problem. 

UTF-8 was noted as a source of vulnerabilities and attack vectors in Cryptogram Newsletter 
July 2000 by Bruce Schneier and Jeffrey Streifling. 

When Microsoft added Unicode support to their Web server, a new way of encoding ../ was 
introduced into their code, causing their attempts at directory traversal prevention to be 
circumvented. 

Multiple percent encodings, such as 

• %c1%1c 
• %c0%9v 
• %c0%af 

translated into / or \ characters. 

Why? Percent encodings were decoded into the corresponding 8-bit characters by the Microsoft 
webserver. This has historically been correct behavior, as Windows and DOS traditionally used 
canonical 8-bit character sets based upon ASCII. 

However, the original UTF-8 was not canonical, and several different byte sequences could now 
be decoded into the same string. Microsoft performed the anti-traversal checks 
without UTF-8 canonicalization, and therefore did not notice that (HEX) C0AF and (HEX) 2F 
were the same character when doing string comparisons. 
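
As a small illustration of why canonical decoding matters, the following Python sketch feeds the 
overlong byte sequence C0 AF to a strict UTF-8 decoder. The helper name is hypothetical; the 
point is only that a decoder which rejects non-canonical sequences never produces a hidden "/". 

```python
def decode_path_strict(raw: bytes) -> str:
    """Decode request bytes as UTF-8; Python's decoder is strict by default
    and refuses overlong (non-canonical) encodings such as C0 AF."""
    return raw.decode("utf-8")


overlong = b"..\xc0\xafetc"  # C0 AF is an overlong encoding of "/" (0x2F)

try:
    decode_path_strict(overlong)
    rejected = False
except UnicodeDecodeError:
    rejected = True

print("overlong sequence rejected:", rejected)
```

Any traversal check should then run on the decoded string, so that the comparison and the file 
access see the same characters. 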


Possible Methods to Prevent Directory Traversal 


A possible algorithm for preventing directory traversal would be to: 

• Process URI requests that do not result in a file request, e.g. executing a hook 
into user code, before continuing below. 

• When a URI request for a file/directory is to be made, build a full path to the 
file/directory if it exists, and normalize all characters (e.g. %20 converted to 
spaces). 

• It is assumed that a 'Document Root' fully qualified, normalized path is known, 
and this string has a length N. Assume that no files outside this directory can be 
served. 

• Ensure that the first N characters of the fully qualified path to the requested file 
are exactly the same as the 'Document Root'. 

• If so, allow the file to be returned. 

• If not, return an error, since the request is clearly out of bounds from what the 
web-server should be allowed to serve. 
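
The steps above can be sketched in Python roughly as follows. This is a minimal sketch under 
stated assumptions: the function name and parameters are hypothetical, and it compares whole 
path components (via os.path.commonpath) rather than the first N raw characters, a small 
variation that also rejects a sibling directory whose name merely starts with the document 
root's name. 

```python
import os
from typing import Optional
from urllib.parse import unquote


def resolve_under_root(document_root: str, requested: str) -> Optional[str]:
    """Return the normalized absolute path if it lies inside document_root, else None."""
    root = os.path.realpath(document_root)
    # Normalize characters first (e.g. %20 -> space), then build the full path.
    relative = unquote(requested).lstrip("/\\")
    candidate = os.path.realpath(os.path.join(root, relative))
    try:
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        inside = False
    return candidate if inside else None


if __name__ == "__main__":
    print(resolve_under_root("/var/www", "/index.html"))
    print(resolve_under_root("/var/www", "../../etc/passwd"))
```

A caller would serve the file only when the function returns a path, and return an error 
otherwise. 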


Evil twin (wireless networks) 


Evil Twin is a term for a rogue Wi-Fi access point that appears to be a legitimate one offered 
on the premises, but actually has been set up by a hacker to eavesdrop on wireless 
communications among Internet surfers. [1] 

This type of Evil Twin Attack may be used by a hacker to steal the passwords of 
unsuspecting users by either snooping the communication link or by phishing, which involves 
setting up a fraudulent Web site and luring people there. [2] 

A rogue Wi-Fi connection can be set up on a laptop with a bit of simple programming and a 
special USB (Universal Serial Bus) thumb drive that acts as an access point. The access points 
are hard to trace, since they can suddenly be shut off, and are easy to build. A hacker can make 
their own wireless networks that appear to be legitimate by simply giving their access point a 
similar name to the Wi-Fi network on the premises. Since the hacker may be physically closer to 
the victim than the real access point, their signal will be stronger, potentially drawing more 
victims. The hacker's computer can be configured to pass the person through to the legitimate 
access point while monitoring the traffic of the victim, or it can simply say the system is 
temporarily unavailable after obtaining a user id and password. [3] 

Several free programs available on the Internet can decode packets to reveal clear-text logins 
and passwords. Using an Evil Twin attack, a hacker is able to harvest credentials from Web 
applications, such as email, that send passwords in clear text. 

One way that corporate users can protect themselves from an Evil Twin attack is by using a 
VPN (virtual private network) when logging into company servers. They can also ask the 
wireless provider to provide the SSID, which is the exact name of the wireless network. They 
should not send sensitive information such as bank account information or corporate user ids and 
passwords over a wireless network. [4] 

Hackers typically set up Evil Twin attacks near free hotspots, such as airports, cafes, hotels or 
libraries. [5] 


Reference 

1. "Strange Wi-Fi spots may harbor hackers: ID thieves may lurk behind a hot 
spot with a friendly name." Andrew D. Smith. The Dallas Morning News. Knight 
Ridder Tribune Business News. Washington: May 9, 2007. pg. 1. Source type: 
Wire Feed. ProQuest document ID: 1267536891 (subscription). Retrieved June 6, 2007. 
2. "Security Watch." Daniel Wolfe. American Banker. New York, N.Y.: Feb 14, 
2007. Vol. 172, Iss. 31; pg. 7. (A security firm used an Evil Twin as a test to 
obtain passwords from attendees at the RSA security conference.) Source type: 
Newspaper. ISSN: 00027561. ProQuest document ID: 1219496681 (subscription). 
Retrieved June 6, 2007. 
3. "Computer Column." Craig Crossman. Knight Ridder Tribune Business News. 
Washington: Aug 24, 2005. pg. 1. Source type: Wire Feed. ProQuest document 
ID: 886418531 (subscription). Retrieved June 6, 2007. 
4. "Attorney General Madigan warns computer users about 'Evil twin' attacks at 
wireless hotspots." US Fed News Service, Including US State News. Washington, 
D.C.: Jan 17, 2006. News release by Illinois Attorney General. Source type: Wire 
Feed. ProQuest document ID: 975720601 (subscription). Retrieved June 6, 2007. 
5. "Access Without Authentification: how and why we let anyone surf our 
wireless." Donna Watkins. Computers in Libraries. Westport: Mar 2006. Vol. 26, 
Iss. 3; pg. 10, 5 pgs. Source type: Periodical. ISSN: 10417915. ProQuest document 
ID: 1000365471 (subscription). Retrieved June 6, 2007. 

• Jeremy Kirk, "'Evil twin' Hotspots Proliferate", IDG News Service, Apr 25, 2007 


HTTP response splitting 


HTTP response splitting is a form of web application vulnerability, resulting from the 
failure of the application or its environment to properly sanitize input values. It can be used to 
perform cross-site scripting attacks, cross-user defacement, Web cache poisoning, and similar 
exploits. 

The attack consists of making the server print a carriage return (CR, ASCII 0x0D) line feed 
(LF, ASCII 0x0A) sequence followed by content supplied by the attacker in the header section of 
its response, typically by including them in input fields sent to the application. Per the HTTP 
standard (RFC 2616), headers are separated by one CRLF and the response's headers are 
separated from its body by two. Therefore, the failure to remove CRs and LFs allows the attacker 
to set arbitrary headers, take control of the body, or break the response into two or more separate 
responses (hence the name). 




Example 
Code at risk 


In its simplest form consider a PHP redirect on page redir.php: 


<?php header("Location: http://example.tld/goto.php?id=" . $_GET['id']); ?> 


This adds a Location header to the HTTP response. $_GET['id'] is replaced with the "id" field 
from the query string, so a request like: 


http://any.server.net/redir.php?id=send_me_here 


will include "send_me_here" in the response: 


HTTP/1.1 302 
Date: something 
Location: http://example.tld/goto.php?id=send_me_here 
Timeout: something 
Content-Type: text/html 


The attack 


An attacker may want to change the cookie a target is given for a website, possibly as part of 
a session fixation attack. This can be done by including the following header: 


Set-Cookie: some=value 


The attacker can send their target to the following URL: 


http://example.tld/redir.php?id=%0d%0aSet-Cookie%3A+some%3Dvalue 


The id field, "%0d%0aSet-Cookie%3A+some%3Dvalue", will be decoded to produce a CRLF followed by 
"Set-Cookie: some=value". This string is then appended to the Location header: 


HTTP/1.1 302 
Date: something 
Location: http://example.tld/goto.php?id= 
Set-Cookie: some=value 
Timeout: something 
Content-Type: text/html 


Prevention 


The generic solution is to URL-encode strings before inclusion into HTTP headers such as 
Location or Set-Cookie. 

The example's code could be protected from this attack by sanitizing $_GET['id']. Typical 
examples of sanitization include casting to integer, or aggressive regular expression replacement. 
It is worth noting that although this is not a PHP specific problem, the PHP interpreter contains 
protection against this attack since versions 4.4.2 and 5.1.2 [1]. 
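
The two mitigations mentioned above (URL-encoding the value, or refusing CR and LF outright) 
could look roughly like this in Python. The function names are hypothetical and the sketch only 
shows how the header value is prepared, not a full redirect handler. 

```python
from urllib.parse import quote


def safe_redirect_location(base: str, user_value: str) -> str:
    """Percent-encode the user-supplied part so CR and LF cannot end the header line."""
    return base + quote(user_value, safe="")


def reject_crlf(header_value: str) -> str:
    """Alternative: refuse any header value that contains a raw CR or LF."""
    if "\r" in header_value or "\n" in header_value:
        raise ValueError("CR/LF not allowed in an HTTP header value")
    return header_value


if __name__ == "__main__":
    attack = "\r\nSet-Cookie: some=value"  # the decoded "id" field from the example
    print(safe_redirect_location("http://example.tld/goto.php?id=", attack))
```

With the encoded form, the injected text stays inside the Location URL instead of starting a 
new header. 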


IDN homograph attack 


The internationalized domain name (IDN) homograph attack is a means by which a 
malicious party may seek to deceive computer users about what remote system they are 
communicating with, by exploiting the fact that many different characters may have nearly (or 
wholly) indistinguishable glyphs. 


Homographs 


In multilingual computer systems, different logical characters may have identical or very 
similar appearances. For example, Unicode character U+0430, Cyrillic small letter a ("0"), can 
look identical to Unicode character U+0061, Latin small letter a, ("a") which is the lowercase "a" 
used in English. Technically, characters that look alike in this way are known as homoglyphs (a 
subgroup of homographs). Spoofing attacks based on these similarities are known as 
homograph spoofing attacks. 

The problem arises from the different treatment of the characters in the users mind and the 
computer's programming. From the viewpoint of the user, a Cyrillic "0" within a Latin string is a 
Latin "a"; there is no difference in the glyphs for these characters in most fonts. However, the 
computer treats them differently when processing the character string as an identifier. Thus, the 
user's assumption of a one-to-one correspondence between the visual appearance of a name, and 
the named entity, breaks down. 

In a typical example of a hypothetical attack, someone could register a domain name that 
appears identical to an existing domain but goes somewhere else. For example, the spoofed 
domain "pOypal.com" contains a Cyrillic a, not a Latin a. In many ways, this is not a new thing. 
For example, even staying within the old character set of A-Z, 0-9 and hyphen, GOOGLE.COM 
looks much like GOOGLE.COM in some fonts; or, using a mix of uppercase and lowercase 
characters, google.com (capital J, not small ell) looks much like google.com in some fonts. 
PayPal itself was a target of a phishing scam exploiting this, using the domain PayPal.com Or, 
displaying characters in lowercase alone, rnozilla.org ("RNOZILLA.ORG") looks very much 
like mozilla.org in many fonts. What is new was that the expansion by the internationalized 
domain name system of the character repertoire from a few dozen characters in a single alphabet 
to many thousands of characters in many scripts greatly increased the scope for homograph 
attacks. 


Homographs in internationalized domain names 


The limitation of domain names to ASCII characters may not last forever, and is coming 
under pressure from organizations based in regions that do not use Latin characters. 
Internationalized domain names provide a backward-compatible way for domain names to use 
the full Unicode character set, and this standard is already widely supported. 

For example, the Russian newspaper website gazeta.ru may wish to use the URL 
газета.ру, reflecting the newspaper's name spelled in Cyrillic. The disadvantage in this 
example is that the Cyrillic letters 'а', 'е', 'р', 'у' are indistinguishable in writing from their Latin 
counterparts. Some of the letters (such as a) are close etymologically, while others look similar 
by coincidence. For instance, the Cyrillic letter 'р' represents a phoneme similar to the English 
'r', but the glyph is identical to the Latin letter 'p'. 

This opens a rich vein of opportunities for phishing and other varieties of fraud. An attacker 
could register a domain name that looks just like that of a legitimate website, but in which some 
of the letters have been replaced by homographs in another alphabet. The attacker could then 
send e-mail messages purporting to come from the original site, but directing people to the bogus 
site. The spoof site could then record information such as passwords or account details, while 
passing traffic through to the real site. The victims may never notice the difference, until 
suspicious or criminal activity occurs with their accounts. 


Defending against the attack 


The simplest defense is for web browsers not to support IDNA or other similar mechanisms, 
or for users to turn off whatever support their browsers have. That could mean blocking access to 
IDNA sites, but generally browsers permit access and just display IDNs in Punycode. Either 
way, this amounts to abandoning non-ASCII domain names. 

Firefox and Opera display punycode for IDNs unless the top-level domain (TLD, for 
example, .ac or .museum) prevents homograph attacks by restricting which characters can be 
used in domain names. They both also allow users to manually add TLDs to the allowed list. 

Internet Explorer 7 allows IDNs except for labels that mix scripts for different languages. 
Labels that mix scripts are displayed in punycode. There are exceptions for locales where ASCII 
characters are commonly mixed with localized scripts. 
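
A mixed-script policy of the kind described above can be approximated in Python as follows. This 
is a rough sketch, not how any browser actually implements it: the function names are 
hypothetical, and taking the first word of the Unicode character name as the "script" is a 
simplifying assumption. 

```python
import unicodedata


def scripts_in_label(label):
    """Approximate the scripts used in a label by the first word of each letter's
    Unicode name (e.g. 'LATIN', 'CYRILLIC'). This is a heuristic only."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def display_form(domain):
    """Show punycode when any label mixes scripts; otherwise show the name as typed."""
    if any(len(scripts_in_label(label)) > 1 for label in domain.split(".")):
        return domain.encode("idna").decode("ascii")
    return domain


if __name__ == "__main__":
    spoofed = "p\u0430ypal.com"  # second letter is Cyrillic U+0430, not Latin "a"
    print(display_form(spoofed))
    print(display_form("paypal.com"))
```

The spoofed name is shown in its "xn--" form, which makes the substitution visible to the user, 
while the all-Latin name is displayed unchanged. 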

As an additional defense, Internet Explorer 7, Firefox 2.0 and Opera 9.10 include phishing 
filters to alert users when they visit malicious websites. 

Another possible defense would be for web browsers to display non-ASCII characters in 
URLs distinctively, perhaps by changing their color or that of their background. This wouldn't 
provide protection against spoofing by changing one non-ASCII character to another 
similar-looking one (for example, replacing a Greek ο with a Cyrillic о or vice versa). (A solution 
to this problem would be to use a different color for each character group, but no software 
implements it that way.) The highlighting approach was adopted, as of July 9, 2005, by the plug-in 
Quero Toolbar for Internet Explorer. Besides IDN highlighting, Quero has implemented several 
other techniques to mitigate IDN spoofing attacks, such as mixed-script/missing glyph detection, 
IDN/digit indication and "core domain" highlighting. 

There is not yet (as of March 2005) a clear consensus as to the best way to balance the needs 
of the international community with protection against domain-name spoofing. 


Referer spoofing 


In computer security, referer spoofing or referrer spoofing is the sending of incorrect referer 
information along with an HTTP request, with the aim of gaining unauthorized access to a web 
site. 


Application 


Some subscription sites, especially many pornographic paysites, utilize referer information to 
secure their materials: only browsers arriving from a small set of approved (login-) pages are 
given access; this facilitates the sharing of materials among a group of cooperating paysites. If 
attackers acquire knowledge of these approved referers (which is often trivial because many sites 
follow a common template), they can then gain free access to the materials. 


Tools 


Several software tools exist to facilitate referrer spoofing: 

• Proxomitron offers the capability of referrer spoofing. 

• An open source project provides a feature called "replay headers". 

• The Mozilla Firefox extension refspoof allows the use of a custom referer URL for 
any site one visits, and provides a mechanism to manage a bookmark list of such 
referer/site pairs. 

• A similar Mozilla Firefox extension is spooFXplorer (not compatible with the 
most recent version of Firefox). 

• QuickSpoof and Spooph provide the same functionality for the Internet Explorer 
browser. 

• SuperMegaSpoof is a Windows application that supports both browsers and 
allows users to exchange and rate referer spoofs; it displays advertisements 
while it is running. 

• SpoofForge is similar to SuperMegaSpoof with the exception that spoofs are 
displayed on a web site, and so there is no software to install. 


Session fixation 


Session fixation attacks attempt to exploit the vulnerability of a system which allows one 
person to fixate (set) another person's session identifier (SID). 

Most session fixation attacks are web based, and most rely on session identifiers being 
accepted from URLs (query string) or POST data. 


Attack scenarios 


Alice is a nice girl who just wants to use her bank http://unsafe/. Unfortunately, Alice is not 
very security savvy. 

Mallory is out to get Alice's money from the bank. 

Alice has a reasonable level of trust in Mallory, and will visit links Mallory sends her. 


A simple attack scenario 


Straightforward scenario: 

1. Mallory has determined that http://unsafe/ accepts any session identifier, 
accepts session identifiers from the query string and has no security validation. 
http://unsafe/ is thus not secure. 

2. Mallory sends Alice an e-mail: "Hey, check this out, there is a cool new account 
summary feature on our bank, http://unsafe/?SID=I_WILL_KNOW_THE_SID". 
Mallory is trying to fixate the SID to I_WILL_KNOW_THE_SID. 

3. Alice is interested and visits http://unsafe/?SID=I_WILL_KNOW_THE_SID. The 
usual log-on screen pops up, and Alice logs on. 

4. Mallory visits http://unsafe/?SID=I_WILL_KNOW_THE_SID and now has unlimited 
access to Alice's account. 


Attack using server generated SID 


A misconception is that servers which only accept server generated session identifiers are 
safe from fixation. This is false. 

Scenario: 

1. Mallory visits http://vulnerable/ and checks which SID is returned. For example, 
the server may respond: Set-Cookie: SID=0D6441FEA4496C2. 
2. Mallory is now able to send Alice an e-mail: "Check out this new cool feature on 
our bank, http://vulnerable/?SID=0D6441FEA4496C2". 
3. Alice logs on, with the fixated session identifier SID=0D6441FEA4496C2. 


Attacks using cross-site cooking 


Another session fixation attack, cross-site cooking, exploits browser vulnerabilities. This 
allows a site http://evil/ to store cookies in the browser in the cookie domain of another, 
trusted server, http://good/. For this attack to succeed there is no need for any 
vulnerability within http://good/, as http://good/ may assume browser cookie management is 
secure. 

Scenario: 

1. Mallory sends Alice an e-mail: "Hey, check out this cool site, http://evil/". 

2. Alice visits http://evil/, which sets the cookie SID with the value I_WILL_KNOW_THE_SID 
in the domain of http://good/. 

3. Alice then receives an e-mail from Mallory: "Hey, check out your bank 
account at http://good/". 

4. When Alice logs on, Mallory may use her account using the fixated session 
identifier. 


Counter-measures 


Do not accept session identifiers from GET / POST variables 


Session identifiers in the URL (query string, GET variables) or in POST variables are not 
recommended, as they simplify this attack: it is easy to make links or forms which set GET / 
POST variables. 

Additionally, session identifiers (SIDs) in the query string enable other risks and attack 
scenarios as well: 

• The SID is leaked to other servers through the Referrer. 

• The SID is leaked to other people as users cut & paste "interesting links" from the 
address bar into chat, forums, communities etc. 

• The SID is stored in many places (browser history log, web server log, proxy logs, ...). 


Solution: Store session identifier in HTTP cookie 


On most modern systems the session identifier is stored by default in an HTTP cookie, which 
has a moderate level of security. 

However, session identifiers are often accepted from GET/POST as well on these standard 
systems. The server configuration must be modified in order to protect against this vulnerability. 

For PHP, if you have access to the PHP configuration, use this in php.ini: 


; Whether to use cookies. 
session.use_cookies = 1 

; This option enables administrators to make their users invulnerable to 
; attacks which involve passing session ids in URLs; defaults to 0. 
session.use_only_cookies = 1 


Solution: Utilize SSL / TLS Session identifier 


When HTTPS is enabled, some systems allow applications to obtain the SSL / 
TLS session identifier. Use of the SSL/TLS session identifier is very secure, but many web 
development languages do not provide robust built-in functionality for this. 

The SSL/TLS session identifier may only be suitable for critical applications, such as large 
financial sites, due to the size of the system. It is, however, an issue rarely debated even in 
security forums. 


Regenerate SID on each request 


A countermeasure against session fixation is to generate a new session identifier (SID) on 
each request. Thus, although an attacker may trick a user into accepting a known SID, the SID will 
be invalid when the attacker attempts to re-use it. Implementation of such a system is simple, 
as demonstrated by the following: 

• Get the previous session identifier OLD_SID from the HTTP request. 
• If OLD_SID is null, empty, or no session with SID=OLD_SID exists, create a new 
session. 
• Generate a new session identifier NEW_SID with a secure random number 
generator. 
• Let the session be identified by SID=NEW_SID (and no longer by SID=OLD_SID). 
• Transmit the new SID to the client. 

Example: 

If Mallory successfully tricks Alice into visiting http://victim/?SID=I_KNOW_THE_SID, this 
HTTP request is sent to victim: 


GET /?SID=I_KNOW_THE_SID HTTP/1.1 
Host: victim 


victim accepts SID=I_KNOW_THE_SID, which is bad. However, victim is secure because it 
performs session regeneration. victim sends the following response: 


HTTP/1.1 200 OK 
Set-Cookie: SID=3134998145AB331F 


Alice will now use SID=3134998145AB331F, which is unknown to Mallory, and 
SID=I_KNOW_THE_SID is invalid. Mallory is thus unsuccessful in the session fixation attempt. 

Unfortunately session regeneration is not always possible. Known cases where session 
regeneration may cause problems include when third party software such as ActiveX, Java 
Applets, or browser plugins communicate with the server as well. Third party software could 
cause logouts, or the session could be split into two separate sessions. 
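
The regeneration steps listed in this section can be sketched in Python with an in-memory 
session table. The class and method names are hypothetical, and a real application would use its 
framework's session storage; the sketch only shows the SID rotation itself. 

```python
import secrets


class SessionStore:
    """Toy in-memory session table that issues a fresh SID on every request."""

    def __init__(self):
        self._sessions = {}

    def handle_request(self, old_sid):
        # Look up the previous session and detach it from its old identifier.
        data = self._sessions.pop(old_sid, None) if old_sid else None
        if data is None:
            data = {}  # unknown or missing SID: start a new, empty session
        # Generate the new identifier with a cryptographically secure generator.
        new_sid = secrets.token_hex(16)
        self._sessions[new_sid] = data
        return new_sid, data  # new_sid is what would be sent back in Set-Cookie


if __name__ == "__main__":
    store = SessionStore()
    sid, session = store.handle_request("I_KNOW_THE_SID")  # Alice follows the link
    session["user"] = "alice"                               # Alice logs on
    _, mallory_view = store.handle_request("I_KNOW_THE_SID")
    print("user" in mallory_view)  # Mallory's fixated SID maps to an empty session
```

Because the old identifier is discarded on every request, the value Mallory chose never becomes 
a key to Alice's session. 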


Accept only server generated SID 


One way to improve security is to reject session identifiers that were not generated by the server. 


if (!isset($_SESSION['SERVER_GENERATED_SID'])) { 
    session_destroy(); // destroy all data in session 
} 
session_regenerate_id(); // generate a new session identifier 
$_SESSION['SERVER_GENERATED_SID'] = true; 


Logout function 


A logout function is useful as it allows users to indicate that a session should not allow 
further requests. Thus attacks can only be effective while a session is active. 


if (isset($_GET['LOGOUT'])) 
    session_destroy(); // destroy all data in session 


Time-out old SIDs 


This defense is simple to implement and has the advantage of providing a measure of 
protection against unauthorized users accessing an authorized user's account by using a machine 
they may have left unattended. 

Store a session variable containing a time stamp of the last access made by that SID. When 
that SID is next used compare the current timestamp (in PHP you can get this by using the time() 
function call) with the one stored in the session. If the difference is greater than a predefined 
number, say 5 minutes, destroy the session. Otherwise, update the session variable with the 
current timestamp. 
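
The time-out check can be sketched in Python as follows. The function name, the "last_access" 
key and the use of a plain dictionary for the session are assumptions made for illustration; the 
5 minute limit is the value used in the text above. 

```python
import time

SESSION_TIMEOUT_SECONDS = 5 * 60  # the "predefined number" from the text


def touch_session(session, now=None):
    """Return True and refresh the timestamp if the session is still fresh;
    otherwise clear the session data and return False."""
    now = time.time() if now is None else now
    last_access = session.get("last_access")
    if last_access is not None and now - last_access > SESSION_TIMEOUT_SECONDS:
        session.clear()  # destroy all data in the session
        return False
    session["last_access"] = now
    return True


if __name__ == "__main__":
    session = {"user": "alice"}
    print(touch_session(session, now=1000.0))  # first access
    print(touch_session(session, now=1200.0))  # 200 s later, still valid
    print(touch_session(session, now=1600.0))  # 400 s later, timed out
```

Passing the current time as a parameter is only a convenience for testing; in normal use the 
function reads the clock itself. 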


Destroy session if Referrer is suspicious 


When visiting a page, most browsers will set the Referrer: the page from which you clicked to 
get to this page. 

If a site that users log into should rarely be linked to from outside (e.g. banking websites, 
webmail), and/or its sessions are short enough that users are unlikely to run into external links 
to the site while logged in, it is usually highly suspicious if the Referrer is not from within 
the site. 

For example, http://vulnerable/ may employ the following security check: 


if (strpos($_SERVER['HTTP_REFERER'], 'http://vulnerable/') !== 0) { 
    session_destroy(); // destroy all data in session 
} 
session_regenerate_id(); // generate a new session identifier 


Verify that additional information is consistent throughout session 


One way to further improve security is to check that requests appear to come from the same end user 
(client) throughout a session. This makes session fixation and other attacks somewhat harder to perform. 


IP address 


As more and more networks conform to RFC 3704 and other anti-spoofing practices, 
IP addresses become more reliable as a "same source" identifier. Therefore, verifying that the 
source IP is consistent throughout a session can improve the security of a web site. 

This could be performed in this manner: 


if ($_SERVER['REMOTE_ADDR'] != $_SESSION['PREV_REMOTEADDR']) {
    session_destroy(); // destroy all data in session
}
session_regenerate_id(); // generate a new session identifier
$_SESSION['PREV_REMOTEADDR'] = $_SERVER['REMOTE_ADDR'];


However, there are some things to consider before employing this approach. 
• Several users may share one IP address. It is not uncommon for an entire building 
to share one IP address using NAT. 
• One user may have an inconsistent IP address. This is true for users behind some 
proxies (such as AOL customers). It is also true for some mobile / roaming 
users. 

For some sites, the added security outweighs the lack of convenience, and, for others, it does 
not. 


User Agent 


Browsers identify themselves with the "User-Agent" HTTP header. This header does not 
normally change during use; it would be extremely suspicious if that were to happen. A web 
application might make use of User-Agent detection in an attempt to prevent malicious users from 
stealing sessions. This is security through obscurity, which has its uses, but it should not be used 
as the only defence. 


if ($_SERVER['HTTP_USER_AGENT'] !== $_SESSION['PREV_USERAGENT']) {
    session_destroy(); // destroy all data in session
}
session_regenerate_id(); // generate a new session identifier
$_SESSION['PREV_USERAGENT'] = $_SERVER['HTTP_USER_AGENT'];


Defense in Depth 


Defense in depth means combining several countermeasures. The idea is simple: if one obstacle 
is hard to overcome, several obstacles could be very hard to overcome. 
A defense-in-depth (DiD) strategy could involve: 
• Enabling HTTPS (to protect against other problems) 
• Correcting the configuration (do not accept external SIDs, set a time-out, etc.) 
• Performing session regeneration, supporting log-out, rejecting illegal referrers, etc. 
The following PHP script demonstrates some such countermeasures put together in a 
defense-in-depth manner: 


if (strpos($_SERVER['HTTP_REFERER'], 'https://DiD/') !== 0) session_destroy();
if (isset($_GET['LOGOUT'])) session_destroy();
if ($_SERVER['REMOTE_ADDR'] !== $_SESSION['PREV_REMOTEADDR']) session_destroy();
if ($_SERVER['HTTP_USER_AGENT'] !== $_SESSION['PREV_USERAGENT']) session_destroy();
session_regenerate_id(); // generate a new session identifier
$_SESSION['PREV_USERAGENT'] = $_SERVER['HTTP_USER_AGENT'];
$_SESSION['PREV_REMOTEADDR'] = $_SERVER['REMOTE_ADDR'];


See also 
• Session poisoning 


Session poisoning 


Session poisoning (also referred to as "session data pollution" and "session modification") 
is the exploitation of insufficient input validation in server applications that copy user input into 
session variables. 

The underlying vulnerability is a state management problem: shared state, race conditions, 
ambiguity in use, or plain unprotected modification of state values. 

Session poisoning has been demonstrated in server environments where different non-malicious 
applications (scripts) share the same session state but use it differently, causing 
ambiguity and race conditions. 

Session poisoning has also been demonstrated in scenarios where the attacker is able to introduce 
malicious scripts into the server environment, which is possible if attacker and victim share a 
web hotel (a shared web hosting server). 


Origins 


Alla Bezroutchko asked in January 2006 whether "Session data pollution vulnerabilities in web applications" was 
a new problem. 
This was, however, an old vulnerability previously noted by others: 
• "this is a classic state management issue" - Yvan Boily 
• "This is not new" - /someone 
Googling for this issue finds hits from 
• July 2001: Serious security hole in Mambo Site Server version 3.0.X, by Ismael 
Peinado Palomo of reverseonline.com 
• September 2005: PHP Session modification, by unknow (from uw-team) and 
adam_i 
• December 2005: Experts Exchange: UserID Session variable data changing?! 
It is possible to dig up much older references, but most old issues are hard to find by search because 
no generally accepted term for this issue existed, nor was it taught or discussed in popular web 
programming FAQs such as the PHPSEC.ORG PHP Security Guide. 


Attack examples 


Trivial attack scenario 


In Experts Exchange: UserID Session variable data changing?!, it was discussed that 


Session("Login") = Request("login")
Session("Username") = Request("Username")


was subject to trivial attacks such as 


vulnerable.asp?login=YES&Username=Mary 


This problem can occur in a design such as the following: 
• The user submits a username and password to logon.asp. 
• If the password for Mary checks out, logon.asp forwards to 
vulnerable.asp?login=YES&Username=Mary 

That is, the problem is that vulnerable.asp is designed only for the case where the page is accessed in 
a non-malicious way. Anyone who realizes how the script is designed can craft an HTTP 
request that sets the logged-on user arbitrarily. 
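
A safer design sets the session variables only in the script that actually verified the credentials, and never copies them from request parameters. The sketch below illustrates this in PHP rather than ASP, to match the other examples in this guide. The function name attemptLogin, the Password field and the $accounts array (user name mapped to a password hash) are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical sketch: $accounts maps user names to hashes made with password_hash().
function attemptLogin(array $request, array $accounts, array &$session): bool
{
    $user = $request['Username'] ?? '';
    $pass = $request['Password'] ?? '';
    if (!is_string($user) || !is_string($pass) || !isset($accounts[$user])) {
        return false;
    }
    if (!password_verify($pass, $accounts[$user])) {
        return false;
    }
    // Session state is derived from the verified result, never copied from the request.
    $session['Login'] = 'YES';
    $session['Username'] = $user;
    return true;
}
```

With this shape, a request such as ?login=YES&Username=Mary carries no password, fails verification, and leaves the session untouched.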


Exploiting ambiguous or dual use of same session variable 


Alla Bezroutchko discusses a scenario where $_SESSION['login'] is used for two different 
purposes. 
• In the login scripts, the session variable stores "this user is logged on". 
• In the password reset scripts, the session variable stores "this user wants his 
password reset". 
A race condition was demonstrated, in which the reset scripts could be exploited to change 
the logged-on user arbitrarily. 


Exploiting scripts allowing writes to arbitrary session variables 


/someone discusses examples observed in development forums that allow writing to 
arbitrary session variables. 
The first example is 


$var = $_GET["something"];
$_SESSION["$var"] = $var2;


(in which $_GET["something"] probably is from a selection box or similar). 
Attack becomes 


vulnerable.php?something=SESSION_VAR_TO_POISON 
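
One possible countermeasure is to let request input select only from a fixed allow-list of session keys. The sketch below assumes a hypothetical preferences form; the constant WRITABLE_SESSION_KEYS, the function setPreference and the key names are invented for this example.

```php
<?php
// Hypothetical allow-list of session keys that a form is permitted to set.
const WRITABLE_SESSION_KEYS = ['language', 'theme'];

function setPreference(array &$session, string $key, string $value): bool
{
    if (!in_array($key, WRITABLE_SESSION_KEYS, true)) {
        return false; // refuse to touch keys such as 'login' or 'user_id'
    }
    $session[$key] = $value;
    return true;
}

// Possible use: setPreference($_SESSION, $_GET['something'], $value);
```

Security-relevant keys stay out of reach because the attacker-controlled name is compared against known-safe names instead of being used directly as an array index.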


Session poisoning attacks enabled by php.ini: register_globals = on 


The php.ini setting register_globals = on is known to enable security vulnerabilities in several 
applications. PHP server administrators are advised to disable this feature; the directive was 
removed entirely in PHP 5.4. 

Note: A real-world example of session poisoning enabled by register_globals = on was 
publicly demonstrated back in the July 2001 article Serious security hole in Mambo Site Server 
version 3.0.X. 

The second example by /someone is 


if ($condition1) { $var = 'SOMETHING'; }
if ($condition2) { $var = 'OTHER'; }
$_SESSION["$var"] = $var2;


which is vulnerable if: 
• It is possible for the attacker to cause both conditions to be false. 
• php.ini is misconfigured (register_globals = on), which allows the default value of $var 
to be controlled by GPC (GET, POST, or COOKIE) input. 

The attack becomes 


vulnerable.php?var=SESSION_VAR_TO_POISON 


Exploit utilizing a shared PHP server (e.g. web hotel) 


unknow of uw-team.org discusses a scenario where attacker and victim share the same PHP 
server. 
The attack is fairly easy: 
• The attacker first visits the victim's page and, for example, logs on. 
• The attacker then uploads a PHP script to his own account and has it display the contents of 
$_SESSION (set by the victim script). 
• The attacker determines which variable needs to be changed, uploads a script 
that sets this variable, and executes it. 
• The attacker visits the victim's pages to see whether the exploit had the anticipated effect. 
This attack only requires that victim and attacker share the same PHP server. The attack does 
not depend on victim and attacker having the same virtual hostname, as it is trivial for the 
attacker to move the session identifier cookie from one cookie domain to another. 


See also 


• Session fixation 


Browser exploit | Cross-site cooking | Cross-site request forgery | Cross-site scripting | 
Cross-zone scripting | Directory traversal | Evil twin (wireless networks) | HTTP response splitting | 
IDN homograph attack | Referer spoofing | Session fixation | Session poisoning | Website 
spoofing 

This guide is licensed under the GNU Free Documentation License. It uses material from 
the Wikipedia. 


Website spoofing 


Website spoofing is the act of creating a website, as a hoax, with the intention of misleading 
readers into believing that the website was created by a different person or organisation. Normally, the 
spoof website adopts the design of the target website and sometimes has a similar URL. 

Another technique is to use a 'cloaked' URL. By using domain forwarding, or inserting 
control characters, the URL can appear to be genuine while concealing the address of the actual 
website. 

The objective may be fraudulent, often associated with phishing or e-mail spoofing, or it may be to 
criticize or make fun of the person or body whose website the spoofed site purports to 
represent. 

As an example of the use of this technique to parody an organization, in November 2006 two 
spoof websites were produced claiming that Microsoft had bought Firefox and released 
Microsoft Firefox 2007. 


Web syndication 


Web syndication is a form of syndication in which a section of a website is made available 
for other sites to use. This could be simply by licensing the content so that other people can use 
it; however, in general, web syndication refers to making web feeds available from a site in order 
to provide other people with a summary of the website's recently added content (for example, the 
latest news or forum posts). 

Large-scale web syndication of content started in 2001 when Miniclip freely syndicated 
online browser-based interactive games to the masses. Today many different types of content are 
syndicated on the Internet. Millions of online publishers including newspapers, commercial web 
sites and blogs now publish their latest news headlines, product offers or blog postings in 
standard news feed formats. 

Syndication benefits both the websites providing information and the websites displaying it. 
For the receiving site, content syndication is an effective way of adding greater depth and 
immediacy of information to its pages, making it more attractive to users. For the transmitting 
site, syndication drives exposure across numerous online platforms. This generates new traffic 
for the transmitting site — making syndication a free and easy form of advertisement. 

The prevalence of web syndication is also of note to online marketers, since web surfers are 
becoming increasingly wary of providing personal information for marketing materials (such as 
signing up for a newsletter) and expect the ability to subscribe to a feed instead. 

Although the format could be anything transported over HTTP, such as HTML or JavaScript, 
it is more commonly XML. The two main families of web syndication formats are RSS and 
Atom. 


See also 


• Atom (standard) 
• RSS (file format) 
• Web feed 


News aggregators | Podcasting | Video bloggers | Web syndication formats 


News aggregators 


In computing, a feed aggregator, also known as a feed reader or simply as an aggregator, 
is client software or a Web application which aggregates syndicated web content such as news 
headlines, blogs, podcasts, and vlogs in a single location for easy viewing. 


Functions 


Aggregators reduce the time and effort needed to regularly check websites for updates, 
creating a unique information space or "personal newspaper." Once subscribed to a feed, an 
aggregator is able to check for new content at user-determined intervals and retrieve the update. 
The content is sometimes described as being "pulled" to the subscriber, as opposed to "pushed" 
with email or IM. Unlike recipients of some "pushed" information, the aggregator user can easily 
unsubscribe from a feed. 

Aggregator features are frequently built into portal sites (such as My Yahoo! and iGoogle), 
modern Web browsers and email programs. 

The aggregator provides a consolidated view of the content in a single browser display or 
desktop application. Such applications are also referred to as RSS readers, feed readers, feed 
aggregators, news readers or search aggregators. Aggregators with podcasting capabilities can 
automatically download media files, such as MP3 recordings. In some cases, these can be 
automatically loaded onto portable media players (like iPods) when they are connected to the 
PC. 

Recently, so-called RSS-narrators have appeared, which not only aggregate text-only news 
feeds, but also convert them into audio recordings for offline listening. 

The syndicated content an aggregator will retrieve and interpret is usually supplied in the 
form of RSS or other XML-formatted data, such as RDF/XML or Atom. 
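
As a rough illustration of how an aggregator might interpret such data, the sketch below reads the title and link of each item from an RSS 2.0 document using PHP's SimpleXML extension. The function name readRssItems is an assumption for this example, and a real aggregator would also need to handle Atom, RDF/XML and fetching the feed over HTTP.

```php
<?php
// Returns ['title' => ..., 'link' => ...] for each <item> in an RSS 2.0 document.
function readRssItems(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $feed = simplexml_load_string($xml);
    libxml_use_internal_errors($previous);
    if ($feed === false || !isset($feed->channel)) {
        return []; // not well-formed XML, or not an RSS 2.0 feed
    }
    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $items;
}
```

Returning an empty list for unparseable input lets the caller treat a broken feed the same way as a feed with no entries.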


Varieties 


The variety of software applications and components that are available to collect, format, 
translate, and republish XML feeds is a testament to the flexibility of the format and has shown 
the usefulness of presentation-independent data. 


Web based 


Web based aggregators are applications that reside on remote servers and are typically 
available as Web applications such as Google Reader or Bloglines. Because the application is 
available via the Web, it can be accessed anywhere by a user with an Internet connection. 

More advanced methods of aggregating feeds are provided via AJAX coding techniques and 
XML components known as Web widgets. Ranging from full-fledged applications to small 
fragments of code that can be integrated into larger programs, they allow users to aggregate 
OPML files, email services, documents, or feeds into a single interface. Many customizable 
homepage/portal implementations such as iGoogle, Live.com, My Yahoo!, and Pageflakes 
provide such functionality. 

In addition to personal aggregators, planet sites are used by online communities to aggregate 
community blogs in a centralised location. Such sites are named after the Planet aggregator, an 
application designed for this purpose. 


Client software 


Client software aggregators are installed applications designed to collect Web feed 
subscriptions and group them together using a user-friendly interface. The graphical user 
interface of such applications often closely resembles that of popular e-mail clients, using a 
three-panel composition in which subscriptions are grouped in a frame on the left, and individual 
entries are browsed, selected, and read in frames on the right. 

Software aggregators can also take the form of news tickers which scroll feeds like ticker 
tape, alerters that display updates in windows as they are refreshed, or as smaller components 
(sometimes called plugins or extensions), which can integrate feeds into the Operating System or 
software applications such as a Web browser. 


Media Aggregators 


Media aggregators are sometimes referred to as "podcatchers" due to the popularity of the 
term "podcast" used to refer to a web feed containing audio or video. Media aggregators are 
applications, either client software or Web based, which maintain subscriptions to feeds that contain 
audio or video media enclosures. They can be used to automatically download media, play back 
the media within the application interface, or synchronize media content with a portable media 
player. 


Feed filtering 


One of the problems with news aggregators is that the volume of articles can sometimes be 
overwhelming, especially when the user has many Web feed subscriptions. As a solution, many 
feed readers allow users to tag each feed with one or more keywords which can be used to sort 
and filter the available articles into easily navigable categories. 
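
Tag-based filtering of this kind can be sketched in a few lines. The data shape used here (each article carrying the tags of the feed it came from) and the function name filterByTag are assumptions made for illustration, not the format of any particular feed reader.

```php
<?php
// Hypothetical data shape: each article is ['title' => string, 'tags' => string[]].
function filterByTag(array $articles, string $tag): array
{
    $matches = array_filter($articles, function (array $article) use ($tag) {
        return in_array($tag, $article['tags'], true);
    });
    return array_values($matches); // re-index so the result is a plain list
}
```

An article from a feed tagged with several keywords appears under each of them, so the same item can show up in more than one category.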


Podcasting 


A podcast is a digital media file, or a series of such files, that is distributed over the Internet 
using syndication feeds for playback on portable media players and personal computers. A 
podcast is a specific type of webcast which, like 'radio', can mean either the content itself or the 
method by which it is syndicated; the latter is also termed podcasting. The host or author of a 
podcast is often called a podcaster. The term "podcast" is a portmanteau of the name of Apple's 
portable music player, the iPod, and broadcast[1]; a "pod" refers to the iPod, and "cast" to the 
idea of broadcasting. 

In other words, a podcast is a collection of files (usually audio but may include video) 
residing at a unique web feed address. People can "subscribe" to this feed by submitting the feed 
address to an aggregator (like iTunes - software that runs on the consumer's computer). When 
new "episodes" become available in the podcast they will be automatically downloaded to that 
user's computer. Unlike radio or streaming content on the web, podcasts are not real-time. The 
material is pre-recorded and users can check out the material at their leisure, offline. 

Though podcasters' web sites may also offer direct download or streaming of their content, a 
podcast is distinguished from other digital media formats by its ability to be syndicated, 
subscribed to, and downloaded automatically, using an aggregator or feed reader capable of 
reading feed formats such as RSS or Atom. 

Certain podcasts can even be live and interactive. Dozens of podcast enthusiasts can be on at 
once, with the "host" being able to control their audience in the same way a radio host can. 


Mechanics 


Podcasting is an automatic mechanism whereby multimedia computer files are transferred 
from a server to a client, which pulls down XML files containing the Internet addresses of the 
media files. In general, these files contain audio or video, but also could be images, text, PDF, or 
any file type. 

The content provider begins by making a file (for example, an MP3 audio file) available on 
the Internet. This is usually done by posting the file on a publicly available webserver; however, 
BitTorrent trackers also have been used, and it is not technically necessary that the file be 
publicly accessible. The only requirement is that the file be accessible through some known URI 
(a general-purpose Internet address). This file is often referred to as one episode of a podcast. 

The content provider then acknowledges the existence of that file by referencing it in another 
file known as the feed. The feed is a list of the URLs by which episodes of the show may be 
accessed. This list is usually published in RSS format (although Atom can also be used), which 
provides other information, such as publish date, titles, and accompanying text descriptions of 
the series and each of its episodes. The feed may contain entries for all episodes in the series, but 
is typically limited to a short list of the most recent episodes, as is the case with many news 
feeds. Standard podcasts consist of a feed from one author. More recently, multiple authors have 
been able to contribute episodes to a single podcast feed using concepts such as public 
podcasting and social podcasting. 

The content provider posts the feed on a webserver. The location at which the feed is posted 
is expected to be permanent. This location is known as the feed URI (or, perhaps more often, feed 
URL). The content provider makes this feed URI known to the intended audience. 

A consumer uses a type of software known as an aggregator, sometimes called a podcatcher 
or podcast receiver, to subscribe to and manage their feeds. 

A podcast-specific aggregator is usually an always-on program which starts when the 
computer is started and runs in the background. It works like any newsreader, checking each subscribed feed at a 
specified interval, such as every two hours. If the feed data has substantively changed from when 
it was previously checked (or if the feed was just added to the application's list), the program 
determines the location of the most recent item and automatically downloads it. The downloaded 
episodes can then be played, replayed, or archived as with any other computer file. Many 
applications also automatically transfer newly downloaded episodes to a user's 
portable media player, which is connected to the PC running the aggregator, perhaps via a USB 
cable. 
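
The change-detection step can be sketched as below. This is one possible approach, not necessarily how any given podcatcher works: it assumes an RSS 2.0 feed whose items carry a guid and an enclosure, and it assumes the program remembers the guids it has already downloaded. The function name findNewEpisodes and the $seenGuids list are hypothetical.

```php
<?php
// Returns ['guid' => ..., 'url' => ...] for items whose <guid> is not in $seenGuids.
function findNewEpisodes(string $rssXml, array $seenGuids): array
{
    $previous = libxml_use_internal_errors(true);
    $feed = simplexml_load_string($rssXml);
    libxml_use_internal_errors($previous);
    if ($feed === false || !isset($feed->channel)) {
        return [];
    }
    $new = [];
    foreach ($feed->channel->item as $item) {
        $guid = (string) $item->guid;
        if ($guid === '' || in_array($guid, $seenGuids, true)) {
            continue; // unidentifiable, or already downloaded
        }
        // The media file's address is the url attribute of <enclosure>.
        $url = isset($item->enclosure) ? (string) $item->enclosure['url'] : '';
        if ($url !== '') {
            $new[] = ['guid' => $guid, 'url' => $url];
        }
    }
    return $new;
}
```

After downloading each returned URL, the program would add its guid to the remembered list so the episode is not fetched again at the next interval.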

The publish/subscribe model of podcasting is a version of push technology, in that the 
information provider chooses which files to offer in a feed and the subscriber chooses among 
available feed channels. While the user is not "pulling" individual files from the Web, there is a 
strong "pull" aspect in that the receiver is free to subscribe to (or unsubscribe from) a vast array 
of channels. Earlier Internet "push" services (e.g., PointCast) allowed a much more limited 
selection of content. 

In March 2006 it was reported that 80% of podcast "episodes" are "consumed" on the PC 
onto which they are downloaded, i.e. they are never actually transferred to an iPod or other 
portable player, or are deleted from the PC without being listened to. However, the latest 
version of the iTunes program will stop downloading new podcasts that have been subscribed to 
if it detects they are not being listened to. Thus the percentage of unlistened podcasts is 
controlled through this mechanism. 

To conserve bandwidth, users may opt to search for content using an online podcast 
directory. Some directories allow people to listen online and become familiar with the content 
provided from an RSS feed before deciding to subscribe. For most broadband users, bandwidth is 
generally not a major consideration. 


Managing Podcasts on an iPod 


[Illustration: a Smart Playlist in which only the latest unlistened podcasts are placed. The user can thus ensure that 
all news programs are new, current and unlistened. This playlist is synchronised 
regularly with the iPod.] 

iTunes offers the ability to create "Smart Playlists", which control which 
podcasts are in the playlist using multiple criteria such as date, number of times listened, type, 
etc. It is also possible to set up iTunes so that only certain playlists will be synched with the 
iPod. By using a combination of the two techniques it is possible to control exactly which music 
and/or podcasts will be transferred to the iPod. The illustration shows one such 
Smart Playlist, which ensures that only the latest unlistened podcasts are included. 
Any podcast that is more than two weeks old is excluded, as is any podcast that 
the iPod user has already listened to. This smart playlist is synched with the iPod every time the 
iPod is plugged into the PC, ensuring that the user does not have to listen to the same show more 
than once. Once a podcast has been listened to, it will be removed from this list as soon as the 
iPod is synched with the PC. Many criteria can control what goes in a smart 
playlist, such as "name", "artist", "category", "grouping", "kind", "last played", "play count", 
"rating", "last skipped" and "playlist", and these can be combined with functions such as "equals", 
"is greater than", "is less than", "contains", "is true", "is false", "is", "is not", "does not contain", 
"starts with", "ends with", "is in the range", "is before" and "is after". As a result it is possible to 
control exactly which podcasts are transferred to the iPod. 


Example 


For example, a user may want only news programs less than 24 hours old, unlistened science 
programs less than one month old, and all Spanish lessons that he or she has listened to fewer than 
three times. By using smart playlists, the user can ensure that these rules are followed. The 
user would set up four smart playlists: the first containing news podcasts 
downloaded in the last 24 hours, the second containing science podcasts that are unlistened 
and less than one month old, the third containing Spanish lessons that have 
been listened to fewer than three times, and the fourth containing the contents of the 
first three. The fourth smart playlist is the one that would be synched with the iPod. 
The fourth playlist may also contain other playlists in addition to the ones described above. 
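
The rules in this example can be written down as a small predicate. This is only an illustration of the logic, not a description of how iTunes evaluates Smart Playlists: the episode fields (category, ageHours, playCount), the function names, and the reading of "one month" as 30 days are all assumptions.

```php
<?php
// Hypothetical episode shape: ['category' => string, 'ageHours' => int, 'playCount' => int].
function matchesSyncRules(array $episode): bool
{
    switch ($episode['category']) {
        case 'news':     // first playlist: less than 24 hours old
            return $episode['ageHours'] < 24;
        case 'science':  // second playlist: unlistened and under one month (taken as 30 days)
            return $episode['playCount'] === 0 && $episode['ageHours'] < 24 * 30;
        case 'spanish':  // third playlist: listened to fewer than three times
            return $episode['playCount'] < 3;
        default:
            return false;
    }
}

// Fourth playlist: everything matched by any of the first three.
function buildSyncList(array $episodes): array
{
    return array_values(array_filter($episodes, 'matchesSyncRules'));
}
```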


Other uses 


Podcasting's initial appeal was to allow individuals to distribute their own "radio shows," but 
the system quickly became used in a wide variety of other ways, including distribution of school 
lessons, official and unofficial audio tours of museums, conference meeting alerts and updates, 
and by police departments to distribute public safety messages. For example, the Pediaphon 
project provides dynamically generated podcasts of all English, French and German language 
Wikipedia articles. 

Podcasting is becoming increasingly popular in education. Podcasts enable students and 
teachers to share information with anyone at any time. An absent student can download the 
podcast of the recorded lesson. Podcasting can be a tool for teachers or administrators to communicate 
curriculum, assignments and other information to parents and the community. Teachers can 
record book talks, vocabulary or foreign language lessons, international pen pal letters (podcast 
pals!), music performances, interviews and debates. Podcasting can be a publishing tool for student 
oral presentations. Video podcasts can be used in all these ways as well. 


Trademarks 


On February 5, 2005, Shae Spencer Management LLC of Fairport, New York filed a 
trademark application to register PODCAST for an 'online prerecorded radio program over the 
internet'.[4] On September 9, 2005, the United States Patent and Trademark Office rejected the application. 
The rejection notice cited Wikipedia's podcast entry as describing the history of the term. 

As of September 19, 2005, known trademarks that capitalize on podcast include: Podcast 
Realty, GuidePod, PodGizmo, Pod-Casting, MyPod, Podvertiser, ePodcast, PodCabin, Podcaster, 
PodShop, PodKitchen, Podgram, GodPod and Podcast. 

As of February 2007, there have been 24 attempts to register trademarks containing the word 
'PODCAST' in the United States, but only 'PODCAST READY' from Podcast Ready, Inc. was 
approved. 


Other 


In November 2004, the Pittsburgh-based podcast hosting service Liberated Syndication was 
launched. The company, also known as Libsyn, was one of the first podcast hosting companies 
on the scene. 

In 2005, it was reported that Adam Curry had anonymously edited the podcasting entry on 
Wikipedia to remove credits from other people and to inflate his role in its creation. The 
business model of Curry's podcasting network Podshow has since been criticised by many in the 
industry, and has been accused of exploitative practices in its dealings with independent 
podcasters. 

On September 26, 2006, it was reported that Apple Computer had started to crack down on 
businesses using the word 'pod' in product and company names. Apple sent a cease-and-desist 
order that week to Podcast Ready, which markets an application known as myPodder. Lawyers 
for Apple contended that the term 'pod' has been used by the public to refer to Apple's music 
player so extensively that it falls under Apple's trademark protection. It was speculated that this 
activity was part of a bigger campaign for Apple to expand the scope of its existing iPod 
trademark, which included trademarking 'IPODCAST', 'IPOD Socks' and 'POD'.[11] On November 
16, 2006, the Apple Trademark Department returned a letter stating that Apple does not object to third-party 
usage of 'podcast' to refer to podcasting services, and that Apple does not license the term. 


See also 


• Aggregator, for a list of podcasting aggregators 
• Video podcast (vidcast, vidcasting) 


Syndication protocols 


• RSS 
• Atom 


Video podcasts 


Video podcast (sometimes shortened to vidcast or vodcast) is a term used for the online 
delivery of on-demand video clip content via Atom or RSS enclosures. The term is a 
specialization for video of the generally audio-based podcast, and refers to 
the distribution of video where the RSS feed is used as a non-linear TV channel to which 
consumers can subscribe using a PC, TV, set-top box, media center or mobile multimedia device. 


Technology and History 


From a web server, a video podcast can be distributed as a file or as a stream. Both methods 
have their advantages and disadvantages. Downloading complete video podcasts in advance 
gives the user the ability to play the video podcasts offline on, for example, a portable media 
player. A downloaded version can be watched many times with only one download, reducing 
bandwidth costs in this case. Streaming allows seeking (skipping portions of the file) without 
downloading the full video podcast, better statistics and lower bandwidth costs for the servers; 
however, users may have to face pauses in playback caused by slow transfer speeds. 

A podcast client may work with a separate or an integrated player. One example of the 
latter is iTunes, which is an unusual case of a web feed aggregator being added to a media player 
rather than vice versa. 

Channeling of on-demand content has major benefits for both publishers and subscribers. 
Publishers can still bundle content for their audience. Subscribers can consume content on 
demand and do not need keyboard or mouse interfaces to choose channels and items, much like 
zapping through regular TV stations. New episodes show up automatically, so the 
technology is ideal for on-the-go (portable media players and mobile phones) and living-room 
mass media consumption. 


Uses in Education 


Video podcasts enable students and teachers to share information with anyone at any time. If a 
student is absent, they can download the podcast of the recorded lesson. Video podcasts can be a tool for 
teachers or administrators to communicate curriculum, assignments and other information to 
parents and the community. Teachers can record book talks, vocabulary or foreign language 
lessons, international pen pal letters (podcast pals!), music performances, interviews and debates. 
Video podcasting can be a publishing tool for student oral presentations. Audio podcasts can be 
used in all these ways as well. Podcasting also allows people to keep a journal. 


Timeline 


• On 31 October 2003 [1] Canadian film collective Rocket Ace Moving Pictures 
launched the serialized web-based video zombie comedy project Dead End Days 
at DeadEndDays.com. Although its creators referred to it as a "web-serial" 
throughout 2003 and 2004, it was adopted by the podcast community as a 
progenitor [2]. The creators retroactively added RSS features to the site. 

• On 1 January 2004 Steve Garfield launched his video blog and declared that 
2004 would be the year of the video blog. 

• On October 1, 2005 Terra: The Nature of Our World launched, making it the 
first natural history video podcast. It is produced in conjunction with the MFA 
program in Science & Natural History Filmmaking at Montana State University, 
Filmmakers for Conservation, and PBS. 

• On 8 June 2006 the German chancellor Angela Merkel launched her video 
podcast via the bundeskanzlerin website, making her the first head of 
government to launch a regular video podcast. 

• On December 16, 2006, Time magazine named its Person of the Year as "You", 
referring to independent content creators on the web and video podcasters in 
particular. Crash Test Kitchen, a video podcast begun in April 2005 by home 
cooks Waz and Lenny, was featured in the magazine as an example of this 
phenomenon. The magazine itself was a distinctive issue that featured a flexible 
mirror on the cover. 

• On April 13, 2007, the British panel game Have I Got News for You started to 
broadcast a vodcast called Have I Got News for You: The Inevitable Internet 
Spin-off. The first HIGNFY vodcast was presented by Jeremy Clarkson. 

• On 25 May 2007 another British panel game, QI, also announced that it would 
create a vodcast, starting in Series E. 


See also 


• Podcast


Video bloggers


A video blog, sometimes shortened to vlog, is a blog that comprises video. Regular
entries are typically presented in reverse chronological order and often combine embedded video
or a video link with supporting text, images, and other metadata.

Vlogs also often take advantage of web syndication to allow for the distribution of video over
the Internet using either the RSS or Atom syndication formats, for automatic aggregation and
playback on mobile devices and personal computers (see video podcast). Though many vlogs are
collaborative efforts, the majority of vlogs and vlog entries are authored by individuals.


History 


Vlogging arose as a video form of blogging, but video blogging is also, in another sense, a
continuation and expansion of video diaries and bulletins, a form of community media
with a history dating back to the 1980s.

In the 1980s video cameras became more widely available in industrialised countries. The
BBC noted the growth of a social phenomenon of video diary making, and in 1993 Chris Mohr
and Mandy Rose of the BBC Community Programmes Unit ... the television series "Video
Nation", in which members of the public across the UK were given Hi-8 video cameras for
one year, during which time they filmed their everyday lives.

More than 10,000 video tapes were recorded and sent to the BBC, from which
approximately 1,300 shorts were edited and shown on TV. The first to be shown was called
"Mirror" and was made by Gordon Hencher. Viewing figures were high (from 1 million to 9
million) and led to some themed series of "Video Nation" shorts, such as African Shorts, Hong
Kong Shorts, Coming Clean, Bitesize Britain and others.

The BBC's "Video Nation" is still running and operates in 33 regional centres throughout
England and Wales, these days experimenting with new film forms and technologies such as iTV,
HD and online social networking. "Video Nation" has a sub-website within the BBC's
site [1].

During the same few years that "Video Nation" was getting started, the World Wide Web was
emerging on the Internet but, at first, the bandwidth and processing power required to do video
online made it a difficult proposition for most people.

Gradually computers and connection speeds improved. 

Vlogging saw a strong increase in popularity beginning in 2005. The Yahoo!
Videoblogging Group saw its membership increase dramatically that year. The most popular
video sharing site to date, YouTube, founded in February 2005, was publicly launched between
August and November 2005.


Significant events in the development of video blogs 


• 2000, November - Adrian Miles posts the first (known) video blog entry on
November 27, 2000.

• 2003, June 15 - Nacho Duran launches the first (known) South American (Sao
Paulo, Brazil) videoblog, based on soundless loops made out of sequences of
pictures taken from a portable webcam.[7]

• 2004, January 1 - Steve Garfield launches his videoblog and declares that 2004
would be the year of the video blog.

• 2004, June 1 - Jay Dedman starts the Yahoo! Videoblogging Group, which
becomes the center of a community of vloggers.

• 2004, December - mefeedia, the first videoblog directory and aggregator, is
released.

• 2005, January - Vloggercon, the first videoblogger conference, is held in New
York City.

• 2005, February - FreeVlog, a guide to creating a video blog, launches.[12]

• 2005, May - Steve Jobs announces audio and video podcast support in iTunes.[13]

• 2005, June - VlogMap launches and begins mapping both vloggers and videos.[14]

• 2005, July 20 - The Yahoo! Videoblogging Group grows to over 1,000
members.[15]

• 2006, June - Vloggercon 2006, the second annual videoblogger conference, is
held in San Francisco.

• 2006, July - YouTube has become the 5th most popular web destination, with
100 million videos viewed daily and 65,000 new uploads per day.[16]

• 2006, July 5 - Host Amanda Congdon leaves Rocketboom over differences with
her business partner Andrew Baron.[17][18]

• 2006, November - The Vloggies, the first annual videoblogging awards, are held in
San Francisco.[19]

• 2006, November - Google purchases YouTube for $1.65 billion in stock, its
biggest acquisition to date.[20]


Notable video blogs 


Google Video ran its own video blog on blogspot.com from October 2005 to March 2007.
Launched with a message from Google Product Marketing Manager Eva Ho, this vlog
showcased examples of the videos being uploaded and shared at Google. The final Google
videoblog entry was posted on March 2, 2007 and carried the headline "Not really goodbye, just a
change of address" [2]. It also contains an invitation to "check out the YouTube Blog
to find out what the YT editors consider must-see".

The YouTube Blog [3] has been going since July 2005. The May 3, 2007 entry announces
that thousands of YouTube's most popular video makers will, in future, be getting the financial
benefits enjoyed by YouTube's professional content partners. The article mentions six
ultra-popular content providers by name:

• Lonelygirl15
• LisaNova
• renetto
• HappySlip
• smosh
• valsartdiary

Lonelygirl15 is not the vlog of a lonely 15-year-old girl but a fictional story
performed by actress Jessica Lee Rose, Yousef Abu-Taleb and associates. The story is presented
on YouTube and Revver in the form of a video log. It gave the appearance of reality at first but
was eventually discovered and branded a hoax. Fame followed quickly after discovery. Jessica
Rose has since been interviewed on Jay Leno's show and on MTV. She has also appeared on the
cover of Wired magazine.

In 2007 the United States military force in Iraq began a video blog on YouTube, giving a 
view of the Iraq campaign unfiltered by any civilian journalists. [4] It has since become one of 
the most popular pages on the site. 

Other popular YouTube blogs are: 

• Nalts
• WifeOfNalts
• Pipistrello
• Dave's Farm

Other popular/historic video blogs:

• Rocketboom. One of the most popular daily news video blogs.
• Ze Frank. Perhaps the most famous one-man daily vlog. The show ran for one
year, per his original intent.


Further reading 


• Jay Dedman, Joshua Paul. Videoblogging, John Wiley & Sons, June 26, 2006. ISBN
0470037881.

• Michael Verdi, Ryanne Hodson, Diana Weynand, Shirley Craig. Secrets of
Videoblogging, Peachpit Press, April 25, 2006. ISBN 0321429176.

• Stephanie Cottrell Bryant. Videoblogging For Dummies, For Dummies, July 12,
2006. ISBN 0471971774.

• Lionel Felix, Damien Stolarz. Hands-On Guide to Video Blogging and Podcasting:
Emerging Media Tools for Business Communication, Focal Press, April 24, 2006.
ISBN 0240808312.

• Andreassen, T. B. & Berry, D. M. (2006). Conservatives 2.0. Minerva. Norway. Nr
08 2006. pp 92-95.


Web syndication formats 


The following is a list of formats for web feeds. 


Major markup languages 


• Atom
• RSS


Minor markup languages


• ICE
• OML
• OPML
• SyncML
• XOXO
• GData [1]


Historical


• CDF
• Marimba
• Pointcast


Atom 


File extension: .atom, .xml
MIME type: application/atom+xml
Type of format: Syndication
Extended from: XML


The name Atom applies to a pair of related standards. The Atom Syndication Format is an
XML language used for web feeds, while the Atom Publishing Protocol (abbreviated APP, and
often referred to as "AtomPub") is a simple HTTP-based protocol for creating and
updating Web resources.

Web feeds allow software programs to check for updates published on a web site. To provide 
a web feed, a site owner may use specialized software (such as a content management system) 
that publishes a list (or "feed") of recent articles or content in a standardized, machine-readable 
format. The feed can then be downloaded by web sites that syndicate content from the feed, or 
by feed reader programs that allow Internet users to subscribe to feeds and view their content. 

A feed contains entries, which may be headlines, full-text articles, excerpts, summaries, 
and/or links to content on a web site, along with various metadata. 

The development of Atom was motivated by the existence of many incompatible versions of
the RSS syndication format, all of which had shortcomings, and the poor interoperability [1] of
XML-RPC-based publishing protocols. The Atom syndication format was published as an IETF
"proposed standard" in RFC 4287. The Atom Publishing Protocol is still in draft form.


Usage


Web feeds are used by the weblog community to share the latest entries' headlines or their
full text, and even attached multimedia files. (See podcasting, vodcasting, broadcasting,
screencasting, vlogging, and MP3 blogs.) These providers allow other websites to incorporate the
weblog's "syndicated" headline or headline-and-short-summary feeds under various usage
agreements. Atom and other web syndication formats are now used for many purposes, including
journalism, marketing, bug reports, or any other activity involving periodic updates or
publications. Atom also provides a standardized way to export an entire blog, or parts of it, for
backup or for importing into other blogging systems.

A program known as a feed reader or aggregator can check webpages on behalf of a user and 
display any updated articles that it finds. It is common to find web feeds on major Web sites, as 
well as many smaller ones. Some websites let people choose between RSS or Atom formatted 
web feeds; others offer only RSS or only Atom. In particular, many blog and wiki sites offer 
their web feeds in the Atom format. 

Client-side readers and aggregators may be designed as standalone programs or as extensions 
to existing programs like web browsers. Browsers are moving toward integrated feed reader 
functions, such as Safari RSS, Web Browser for S60, Opera, Firefox and Internet Explorer. Such 
programs are available for various operating systems. 

Web-based feed readers and news aggregators require no software installation and make the 
user's "feeds" available on any computer with Web access. Some aggregators syndicate 
(combine) web feeds into new feeds, e.g., taking all football related items from several sports 
feeds and providing a new football feed. There are also search engines for content published via 
web feeds, including Technorati and Blogdigger. 

On Web pages, web feeds (Atom or RSS) are typically linked with the word "Subscribe" or 
with the unofficial web feed logo. 


Atom Compared to RSS 2.0 


The main motivation for the development of Atom was dissatisfaction with RSS. Among 
other things, there are multiple incompatible and widely adopted versions of RSS. The intention 
was to ease the difficulty of developing applications with web syndication feeds. 

A brief description of the ways Atom 1.0 seeks to differentiate itself from RSS 2.0 follows: 

• RSS 2.0 may contain either plain text or escaped HTML as a payload, with no
way to indicate which of the two is provided. Atom, in contrast, explicitly
labels (i.e. types) the payload of each entry. It allows for a wider variety of
payload types, including plain text, escaped HTML, XHTML, XML,
Base64-encoded binary, and references to external content such as documents,
video and audio streams, and so forth.

• RSS 2.0 has a "description" element which can contain either a full entry or just
a description. Atom has separate "summary" and "content" elements. Atom thus
allows the inclusion of non-textual content that can be described by the
summary.

• Atom is defined within an XML namespace whereas RSS 2.0 is not.

• Atom specifies the use of XML's built-in xml:base attribute for relative URIs.
RSS 2.0 does not have a means of differentiating between relative and
non-relative URIs.

• Atom uses XML's built-in xml:lang attribute as opposed to RSS 2.0's use of its
own "language" element.

• In Atom, it is mandatory that each entry have a globally unique ID, which is
important for reliable updating of entries.

• Atom 1.0 allows standalone Atom Entry documents, whereas with RSS 2.0 only
full feed documents are supported.

• Atom specifies that dates be in the format described in RFC 3339 (which is a
subset of ISO 8601). The date format in RSS 2.0 was underspecified and has led
to many different formats being used.

• Atom 1.0 has an IANA-registered MIME type. RSS 2.0 feeds are often sent as
application/rss+xml, although it is not a registered MIME type.

• Atom 1.0 includes a schema (RFC 4287 carries a non-normative RELAX NG
schema). RSS 2.0 does not.

• Atom is an open and evolvable standard developed through the IETF
standardization process. RSS 2.0 is not standardized by any standards body.
Furthermore, according to its copyright, it may not be modified.

• Atom 1.0 elements can be used as extensions to other XML vocabularies,
including RSS 2.0, as illustrated in a weblog post by Tim Bray entitled "Atomic
RSS".

• Atom 1.0 describes how feeds and entries may be digitally signed using the XML
Digital Signature specification such that entries can be copied across multiple
Feed Documents without breaking the signature.

• Despite the emergence of Atom as an IETF Proposed Standard and the decision
by major companies such as Google to embrace Atom, use of the older and more
widely known RSS 1.0 and RSS 2.0 formats has continued.

• RSS 2.0 support for enclosures led directly to the development of podcasting.
While many podcasting applications, such as iTunes, support the use of Atom
1.0, RSS 2.0 remains the preferred format.

• Many sites choose to publish their feeds in only a single format. For example,
CNN, the New York Times, and the BBC offer their web feeds only in RSS 2.0
format.

• News articles about web syndication feeds have increasingly used the term
"RSS" to refer generically to any of the several variants of the RSS format, such
as RSS 2.0 and RSS 1.0, as well as the Atom format. (For example, "Fine-Tuning
Your Filter for Online Information" (NYT) and "There's a Popular New Code for
Deals: RSS" (NYT, January 29, 2006).)

• Each of the various web syndication feed formats has attracted large groups of
supporters who remain satisfied by the specification and capabilities of their
respective formats.
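
The date-format point in the list above can be illustrated with a short sketch. It assumes
Python's standard library only, and it assumes the RSS 2.0 feed happens to use an RFC 822-style
date, which is common but, as noted above, not guaranteed in practice. The helper names
parse_atom_date and parse_rss_date are hypothetical, and the Atom parser is deliberately
simplified (it handles the "Z" suffix and numeric offsets, not every RFC 3339 variation).

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_atom_date(text):
    """Parse an RFC 3339 timestamp as used by Atom (simplified sketch)."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so rewrite it as an explicit UTC offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def parse_rss_date(text):
    """Parse an RFC 822-style date as commonly found in RSS 2.0 feeds."""
    return parsedate_to_datetime(text)


# The same instant written in each style.
atom_dt = parse_atom_date("2003-12-13T18:30:02Z")
rss_dt = parse_rss_date("Sat, 13 Dec 2003 18:30:02 GMT")
print(atom_dt == rss_dt)
```

A reader that accepts both families of feed would typically need both parsers, plus fallbacks
for the non-conforming RSS dates that the list above alludes to.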


Development History 


Background


Before the creation of Atom, the primary method of web content syndication was the RSS
family of formats.

Members of the community who felt there were significant deficiencies with this family of
formats were unable to make changes directly to RSS 2.0 because the official specification
document stated that it was purposely frozen to ensure its stability.


Initial Work 


In June 2003, Sam Ruby set up a wiki to discuss what makes "a well-formed log entry". This 
initial posting acted as a rallying point. People quickly started using the wiki to discuss a new 
syndication format to address the shortcomings of RSS. It also became clear that the new format 
could also form the basis of a more robust replacement for blog editing protocols such as 
Blogger API and LiveJournal XML-RPC Client/Server Protocol. 

The project aimed to develop a web syndication format that was:

• "100% vendor neutral,"
• "implemented by everybody,"
• "freely extensible by anybody, and"
• "cleanly and thoroughly specified."

In short order, a project road map was built. The effort quickly attracted more than 150 
supporters including David Sifry of Technorati, Mena Trott of Six Apart, Brad Fitzpatrick of 
LiveJournal, Jason Shellen of Blogger, Jeremy Zawodny of Yahoo, Timothy Appnel of the 
O'Reilly Network, Glenn Otis Brown of Creative Commons and Lawrence Lessig. Other 
notables supporting Atom include Mark Pilgrim, Tim Bray, Aaron Swartz, Joi Ito, and Jack Park. 
Also, Dave Winer, the key figure behind RSS 2.0, gave tentative support to the Atom endeavor
(which at the time was called Echo).

After this point, discussion became chaotic, due to the lack of a decision-making process. 
The project also lacked a name, tentatively using "Pie," "Echo," and "Necho" before settling on 
Atom. After releasing a project snapshot known as Atom 0.2 in early July 2003, discussion was 
shifted off the wiki. 


Atom 0.3 and Adoption by Google 


The discussion then moved to a newly set up mailing list. The next and final snapshot during 
this phase was Atom 0.3, released in December 2003. This version gained widespread adoption 
in syndication tools, and in particular it was added to several Google-related services, such as 
Blogger, Google News, and Gmail. Google's Data APIs (Beta) GData are based on Atom 1.0 and 
RSS 2.0. 


Atom 1.0 and IETF Standardization 


In 2004, discussions began about moving the project to a standards body such as the World 
Wide Web Consortium or the Internet Engineering Task Force (IETF). The group eventually 
chose the IETF and the Atompub working group was formally set up in June 2004, finally giving 
the project a charter and process. The Atompub working group is co-chaired by Tim Bray (the
co-editor of the XML specification) and Paul Hoffman. Initial development was focused on the
syndication format.

The Atom Syndication Format was issued as a Proposed Standard in IETF RFC 4287 in 
December 2005. The co-editors were Mark Nottingham and Robert Sayre. 

This document is known as "atompub-format" in IETF's terminology. As of May 2007, three
other drafts were still being worked on as part of the IETF process: atompub-protocol,
atompub-typeparam and atompub-autodiscovery. Their contents and current status can be viewed
at the IETF's web site.


Example of an Atom 1.0 Feed


An example of a document in the Atom Syndication Format:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <subtitle>A subtitle.</subtitle>
  <link href="http://example.org/feed/" rel="self"/>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
    <email>johndoe@example.com</email>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
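
As a rough illustration of how a program might consume such a document, the following sketch
reads the entries of an Atom 1.0 feed with Python's standard xml.etree.ElementTree module. The
function name read_entries and the shape of the returned dictionaries are assumptions made for
this example, not part of the Atom specification, and the embedded feed is a shortened copy of
the example document.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace, so lookups must be qualified.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

FEED_XML = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author><name>John Doe</name></author>
  <id>urn:uuid:60a76c80-d399-11d9-b91C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
"""


def read_entries(xml_text):
    """Return one dict per <entry> in an Atom feed (hypothetical helper)."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for entry in root.iterfind("atom:entry", ATOM_NS):
        link = entry.find("atom:link", ATOM_NS)
        entries.append({
            "id": entry.findtext("atom:id", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", namespaces=ATOM_NS),
            "summary": entry.findtext("atom:summary", namespaces=ATOM_NS),
            "link": link.get("href") if link is not None else None,
        })
    return entries


for item in read_entries(FEED_XML):
    print(item["updated"], item["title"], item["link"])
```

Because every element is looked up through the Atom namespace, the same code ignores any
extension elements from other vocabularies that a feed might carry.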


See also 


• hAtom - microformat for marking up (X)HTML so that Atom feeds can be
derived from it.

• Web syndication

• List of content syndication markup languages

• Podcasting

• RSS


RSS 


File extension: .rss, .xml
MIME type: application/rss+xml (Registration Being Prepared)
Extended from: XML


RSS (which, in its latest format, stands for "Really Simple Syndication") is a family of web 
feed formats used to publish frequently updated content such as blog entries, news headlines or 
podcasts. An RSS document, which is called a "feed", "web feed", or "channel", contains either a 
summary of content from an associated web site or the full text. RSS makes it possible for people 
to keep up with their favorite web sites in an automated manner that's easier than checking them 
manually. 

RSS content can be read using software called a "feed reader" or an "aggregator." The user 
subscribes to a feed by entering the feed's link into the reader or by clicking an RSS icon in a 
browser that initiates the subscription process. The reader checks the user's subscribed feeds 
regularly for new content, downloading any updates that it finds. 
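
The checking step described above can be sketched as follows. This is a minimal illustration
rather than how any particular reader works: it assumes an RSS 2.0 document is already in hand
as a string (no network access is shown), it identifies items by their guid element and falls
back to the link element, and the names find_new_items and SAMPLE_RSS are hypothetical.

```python
import xml.etree.ElementTree as ET

# A small, made-up RSS 2.0 feed used only for this sketch.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Channel</title>
    <link>http://example.org/</link>
    <description>Hypothetical feed for illustration.</description>
    <item>
      <title>First post</title>
      <link>http://example.org/1</link>
      <guid>http://example.org/1</guid>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.org/2</link>
    </item>
  </channel>
</rss>"""


def find_new_items(rss_xml, seen):
    """Return items not seen before and record them in the `seen` set."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iterfind("./channel/item"):
        # Prefer the guid; fall back to the link when no guid is present.
        key = item.findtext("guid") or item.findtext("link")
        if key is None or key in seen:
            continue
        seen.add(key)
        fresh.append({"key": key, "title": item.findtext("title", default="")})
    return fresh


seen_keys = set()
print(len(find_new_items(SAMPLE_RSS, seen_keys)))  # first check: both items are new
print(len(find_new_items(SAMPLE_RSS, seen_keys)))  # second check: nothing new
```

A real reader would also persist the set of seen identifiers between runs and fetch each
subscribed feed on a schedule.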

The initials "RSS" are used to refer to the following formats: 

• Really Simple Syndication (RSS 2.0)
• RDF Site Summary (RSS 1.0 and RSS 0.90)
• Rich Site Summary (RSS 0.91)

RSS formats are specified using XML, a generic specification for the creation of data
formats.


History 


Before RSS, several similar formats already existed for syndication, but none achieved 
widespread popularity or are still in common use today, as most were envisioned to work only 
with a single service. The basic idea of re-structuring metadata information about web sites has 
been traced back at least as far as 1995, and the work of Ramanathan V. Guha and others at 
Apple Computer's Advanced Technology Group developing the Meta Content Framework 
(MCF).[2] Other early work on XML syndication formats, including RDF, took place at 
Netscape, Userland Software, and Microsoft. 

RDF Site Summary, the first version of RSS, was created by Ramanathan V. Guha of
Netscape in March 1999 for use on the My Netscape portal. This version became known as RSS
0.9.[3]

In July 1999, responding to comments and suggestions, Dan Libby produced a prototype
tentatively named RSS 0.91 [4] (RSS standing for Rich Site Summary), that simplified the format
and incorporated parts of Dave Winer's Scripting News format.[5] This they considered an
interim measure, with Libby suggesting an RSS 1.0-like format through the so-called Futures
Document.[6]

In April 2001, in the midst of AOL's acquisition and subsequent restructuring of Netscape
properties, a redesign of the My Netscape portal removed RSS/XML support. The RSS 0.91
DTD was removed during this redesign; in response to feedback, Dan Libby was able to
restore the DTD, but not the RSS validator previously in place. In response to comments within
the RSS community at the time, Lars Marius Garshol, to whom authorship or co-authorship of the
original 0.9 DTD is sometimes attributed, commented, "What I don't understand is all this fuss
over Netscape removing the DTD. A well-designed RSS tool, whether it validates or not, would not
use the DTD at Netscape's site in any case. There are several mechanisms which can be used to
control the dereferencing of references from XML documents to their DTDs. These should be
used. If not the result will be as described in the article."[7]

Effectively, this left the format without an owner, just as it was becoming widely used. 

A working group and mailing list, RSS-DEV, was set up by various users and XML notables
to continue its development. At the same time, Winer unilaterally posted a modified version of
the RSS 0.91 specification to the Userland website, since it was already in use in their products.
He claimed the RSS 0.91 specification was the property of his company, UserLand Software.[8]
Since neither side had any official claim on the name or the format, arguments raged whenever
either side claimed RSS as its own, creating what became known as the RSS fork.

The RSS-DEV group went on to produce RSS 1.0 in December 2000.[9] Like RSS 0.9 (but
not 0.91), this was based on the RDF specifications, but was more modular, with many of the
terms coming from standard metadata vocabularies such as Dublin Core.

Nineteen days later, Winer independently released RSS 0.92,[10] a minor and supposedly
compatible set of changes to RSS 0.91 based on the same proposal. In April 2001, he published a
draft of RSS 0.93, which was almost identical to 0.92.[11] A draft RSS 0.94 surfaced in August,
reverting the changes made in 0.93 and adding a type attribute to the description element.

In September 2002, Winer released a final successor to RSS 0.92, known as RSS 2.0 and 
emphasizing "Really Simple Syndication" as the meaning of the three-letter abbreviation. The 
RSS 2.0 spec removed the type attribute added in RSS 0.94 and allowed people to add extension 
elements using XML namespaces. Several versions of RSS 2.0 were released, but the version 
number of the document model was not changed. 

In November 2002, The New York Times began offering its readers the ability to subscribe
to RSS news feeds related to various topics. In January 2003, Winer called the New York Times'
adoption of RSS the "tipping point" in making the RSS format a de facto standard.

In July 2003, Winer and Userland Software assigned ownership of the RSS 2.0 specification
to Winer's then workplace, Harvard's Berkman Center for the Internet & Society.[12]

In January 2005, Sean B. Palmer, Christopher Schmidt, and Cody Woodard produced a
preliminary draft of RSS 1.1.[13] It was intended as a bugfix for 1.0, removing little-used features,
simplifying the syntax and improving the specification based on the more recent RDF
specifications. As of July 2005, RSS 1.1 had amounted to little more than an academic exercise.

In April 2005, Apple Computer released Safari 2.0 with RSS feed capabilities built in. Safari
delivered the ability to read RSS feeds, and bookmark them, with built-in search features.
Safari's RSS button is a blue rounded rectangle with RSS written inside in white. The favicon
displayed defaults to a newspaper icon.

In November 2005, Microsoft proposed its Simple Sharing Extensions to RSS.[14]

In December 2005, the Microsoft IE team and Outlook team announced in their blogs that
they would be adopting the feed icon first used in the Mozilla Firefox browser, effectively
making the orange square with white radio waves the industry standard for both RSS and related
formats such as Atom. In February 2006, Opera Software announced that they too would add
the orange square in their Opera 9 release.

In January 2006, Rogers Cadenhead relaunched an RSS Advisory Board with a view to
continuing the development of the RSS format and resolving ambiguities. In June 2007, the
board revised its version of the specification to confirm that namespaces may extend core
elements with namespace attributes, as Microsoft has done in Internet Explorer 7. In the board's
view, a difference of interpretation had left publishers unsure of whether this was permitted or
forbidden. No press account of the differences between the Winer spec and the Cadenhead spec
for RSS 2.0 is included in this article's references, though blog searches in May 2007 found
private opinions that the two specs were very similar.


Incompatibilities 


As noted above, there are several different versions of RSS, falling into two major branches 
(RDF and 2.*). The RDF, or RSS 1.* branch includes the following versions: 

• RSS 0.90 was the original Netscape RSS version. This RSS was called RDF Site
Summary, but was based on an early working draft of the RDF standard, and
was not compatible with the final RDF Recommendation.

• RSS 1.0 is an open format by the RSS-DEV Working Group, again standing for
RDF Site Summary. RSS 1.0 is an RDF format like RSS 0.90, but not fully
compatible with it, since 1.0 is based on the final RDF 1.0 Recommendation.

• RSS 1.1 is also an open format and is intended to update and replace RSS 1.0.
The specification is an independent draft not supported or endorsed in any way
by the RSS-DEV Working Group or any other organization.

The RSS 2.* branch (initially UserLand, now Harvard) includes the following versions: 

• RSS 0.91 is the simplified RSS version released by Netscape, and also the version
number of the simplified version championed by Dave Winer from Userland
Software. The Netscape version was now called Rich Site Summary; it was no
longer an RDF format, but was relatively easy to use. It remains the most
common RSS variant.

• RSS 0.92 through 0.94 are expansions of the RSS 0.91 format, which are mostly
compatible with each other and with Winer's version of RSS 0.91, but are not
compatible with RSS 0.90. In all Userland RSS 0.9x specifications, RSS was no
longer an acronym.

• RSS 2.0.1 has the internal version number 2.0. RSS 2.0.1 was proclaimed to be
"frozen", but was still updated shortly after release without changing the version
number. RSS now stood for Really Simple Syndication. The major change in this
version is an explicit extension mechanism using XML Namespaces.

For the most part, later versions in each branch are backward-compatible with earlier
versions (aside from non-conformant RDF syntax in 0.90), and both branches include properly
documented extension mechanisms using XML Namespaces, either directly (in the 2.* branch)
or through RDF (in the 1.* branch). Most syndication software supports both branches. Mark
Pilgrim's article "The Myth of RSS Compatibility" discusses RSS version compatibility in more
detail.

The extension mechanisms make it possible for each branch to track innovations in the other.
For example, the RSS 2.* branch was the first to support enclosures, making it the current
leading choice for podcasting, and as of mid-2005 is the format supported for that use by iTunes
and other podcasting software; however, an enclosure extension is now available for the RSS 1.*
branch, mod_enclosure [1]. Likewise, the RSS 2.* core specification does not support providing
full-text in addition to a synopsis, but the RSS 1.* markup can be (and often is) used as an
extension. There are also several common outside extension packages available, including a new
proposal from Microsoft for use in Internet Explorer 7.

The most serious compatibility problem is with HTML markup. Userland's RSS reader,
generally considered the reference implementation, did not originally filter out HTML
markup from feeds. As a result, publishers began placing HTML markup into the titles and
descriptions of items in their RSS feeds. This behavior has become widely expected of readers,
to the point of becoming a de facto standard, though there is still some inconsistency in how
software handles this markup, particularly in titles. The RSS 2.0 specification was later updated
to include examples of entity-encoded HTML; however, all prior plain text usages remain valid.
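
The ambiguity described above can be made concrete with a small sketch using only Python's
standard library. It assumes the publisher has entity-encoded HTML inside the description
element; the item shown is made up, and strip_markup is a hypothetical helper, not something
defined by any RSS specification.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(text):
    """Treat `text` as HTML and return its text content (hypothetical helper)."""
    parser = _TextOnly()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


# A made-up item whose description carries entity-encoded HTML.
ITEM_XML = (
    "<item><title>Plain title</title>"
    "<description>&lt;p&gt;Robots &lt;b&gt;run&lt;/b&gt; amok "
    "&amp;amp; more&lt;/p&gt;</description></item>"
)

item = ET.fromstring(ITEM_XML)
raw = item.findtext("description")  # the XML parser undoes one level of escaping
plain = strip_markup(raw)           # a second pass interprets the HTML itself
print(raw)
print(plain)
```

Nothing in the feed says whether the second pass is wanted: a publisher who meant the
description as plain text containing a literal "<" would have it misread as markup, which is
the inconsistency the paragraph above describes.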


Atom 


In reaction to recognized issues with RSS (and because RSS 2.0 is frozen), a third group
began a new syndication specification, Atom, in June 2003. Their work was later adopted by the
Internet Engineering Task Force (IETF), leading to the publication of a specification (RFC 4287)
for the Atom Format in 2005. Work on the Atom Publishing Protocol, a standards-based protocol
for posting to publishing tools, is ongoing.

The relative benefits of Atom in comparison to the two RSS branches are a matter of debate 
within the Web-syndication community. Supporters of Atom claim that it improves on RSS by 
relying on standard XML features, by specifying a payload container that can handle many 
different kinds of content unambiguously, and by having a specification maintained by a 
recognized standards organization. Critics claim that Atom unnecessarily introduces a third 
branch of syndication specifications, further confusing the marketplace. 

Atom aims to define both a syntax and a protocol for updating user blogs and thus goes 
beyond the simple remit of RSS. While this is appealing to many users, particularly those in the 
blogging community, it has been met with resistance in the professional community (mainly 
publishers) due to its lack of extensibility.

For a comparison of Atom 1.0 to RSS 2.0 see Atom Compared to RSS 2.0. 


Modules 


The primary objective of all RSS modules is to extend the basic XML schema established for 
more robust syndication of content. This inherently allows for more diverse, yet standardized, 
transactions without modifying the core RSS specification. 

To accomplish this extension, a tightly controlled vocabulary (in the RSS world, "module"; 
in the XML world, "schema") is declared through an XML namespace to give names to concepts 
and relationships between those concepts. 
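As a rough illustration of how a module's vocabulary is reached through its XML namespace, the sketch below reads one extension element from an RSS 2.0 item using Python's standard library. The "rating" module, its namespace URI and its stars element are hypothetical, invented only for this example; they are not one of the published modules.

```python
import xml.etree.ElementTree as ET

# Hypothetical module namespace, used for illustration only.
RATING_NS = "http://example.com/ns/rating/1.0"

FEED = """<?xml version="1.0"?>
<rss version="2.0" xmlns:rating="http://example.com/ns/rating/1.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>A feed that uses one extension module.</description>
    <item>
      <title>First item</title>
      <link>http://example.com/1</link>
      <rating:stars>4</rating:stars>
    </item>
  </channel>
</rss>"""


def read_items(xml_text):
    """Return (title, stars) pairs; stars is None when the module element is absent."""
    root = ET.fromstring(xml_text)
    results = []
    for item in root.iter("item"):
        title = item.findtext("title")
        # Namespaced elements are addressed as {namespace-uri}local-name,
        # so the core RSS elements and the module elements never collide.
        stars = item.findtext("{%s}stars" % RATING_NS)
        results.append((title, stars))
    return results
```

A reader that does not know the module simply never asks for the namespaced element, which is why modules can be added without modifying the core specification.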

Some RSS 2.0 modules with established namespaces: 

• Ecommerce RSS 2.0 Module
• Media RSS 2.0 Module
• OpenSearch RSS 2.0 Module


BitTorrent and RSS


The peer-to-peer application BitTorrent has also announced support for RSS. Such feeds
(also known as Torrent/RSS-es or Torrentcasts) allow client applications to download files
automatically from the moment the RSS reader detects them (a practice known as broadcatching).
Most common BitTorrent clients already offer RSS support.


Examples 
RSS 1.0 


The following is an example of an RSS 1.0 file.

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://www.xml.com/xml/news.rss">
    <title>XML.com</title>
    <link>http://xml.com/pub</link>
    <description>
      XML.com features a rich mix of information and services for the XML community.
    </description>
    <image rdf:resource="http://xml.com/universal/images/xml_tiny.gif" />
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://xml.com/pub/2000/08/09/xslt/xslt.html" />
        <rdf:li rdf:resource="http://xml.com/pub/2000/08/09/rdfdb/index.html" />
      </rdf:Seq>
    </items>
    <textinput rdf:resource="http://search.xml.com" />
  </channel>
  <image rdf:about="http://xml.com/universal/images/xml_tiny.gif">
    <title>XML.com</title>
    <link>http://www.xml.com</link>
    <url>http://xml.com/universal/images/xml_tiny.gif</url>
  </image>
  <item rdf:about="http://xml.com/pub/2000/08/09/xslt/xslt.html">
    <title>Processing Inclusions with XSLT</title>
    <link>http://xml.com/pub/2000/08/09/xslt/xslt.html</link>
    <description>
      Processing document inclusions with general XML tools can be problematic. This article
      proposes a way of preserving inclusion information through SAX-based processing.
    </description>
  </item>
  <item rdf:about="http://xml.com/pub/2000/08/09/rdfdb/index.html">
    <title>Putting RDF to Work</title>
    <link>http://xml.com/pub/2000/08/09/rdfdb/index.html</link>
    <description>
      Tool and API support for the Resource Description Framework is slowly coming of age.
      Edd Dumbill takes a look at RDFDB, one of the most exciting new RDF toolkits.
    </description>
  </item>
  <textinput rdf:about="http://search.xml.com">
    <title>Search XML.com</title>
    <description>Search XML.com's XML collection</description>
    <name>s</name>
    <link>http://search.xml.com</link>
  </textinput>
</rdf:RDF>


RSS 2.0 


The following is an example of an RSS 2.0 file.

<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <language>en-us</language>
    <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
    <lastBuildDate>Tue, 10 Jun 2003 09:41:01 GMT</lastBuildDate>
    <docs>http://blogs.law.harvard.edu/tech/rss</docs>
    <generator>Weblog Editor 2.0</generator>
    <managingEditor>editor@example.com</managingEditor>
    <webMaster>webmaster@example.com</webMaster>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans get ready to work with Russians aboard the International
        Space Station? They take a crash course in culture, language and protocol at Russia's
        Star City.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
    <item>
      <title>Space Exploration</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Sky watchers in Europe, Asia, and parts of Alaska and Canada will experience
        a partial eclipse of the Sun on Saturday, May 31st.</description>
      <pubDate>Fri, 30 May 2003 11:06:42 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/30.html#item572</guid>
    </item>
    <item>
      <title>The Engine That Does More</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-VASIMR.asp</link>
      <description>Before man travels to Mars, NASA hopes to design new engines that will let
        us fly through the Solar System more quickly. The proposed VASIMR engine would do
        that.</description>
      <pubDate>Tue, 27 May 2003 08:37:32 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/27.html#item571</guid>
    </item>
    <item>
      <title>Astronauts' Dirty Laundry</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-laundry.asp</link>
      <description>Compared to earlier spacecraft, the International Space Station has many
        luxuries, but laundry facilities are not one of them. Instead, astronauts have other
        options.</description>
      <pubDate>Tue, 20 May 2003 08:56:02 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/05/20.html#item570</guid>
    </item>
  </channel>
</rss>
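
To show how a program might consume a file of this shape, here is a minimal sketch that reads the channel title and the items of an RSS 2.0 document with Python's standard library. The shortened sample feed and the function name parse_channel are choices made for this illustration; a real aggregator would need far more error handling.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# A shortened feed in the same shape as the RSS 2.0 example (one item only).
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
  </channel>
</rss>"""


def parse_channel(xml_text):
    """Return (channel title, list of item dicts) for an RSS 2.0 document."""
    channel = ET.fromstring(xml_text).find("channel")
    items = []
    for item in channel.findall("item"):
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "guid": item.findtext("guid"),
            # RSS 2.0 dates use the e-mail (RFC 822) date format.
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return channel.findtext("title"), items
```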


See also 
• Aggregators
• Atom
• Podcasting
• Syndication
• Yahoo! Pipes, a tool for creating mashups of RSS feeds


RSS enclosure | Rss tracking | RSS TV | Yahoo! Pipes 


RSS enclosure 


RSS enclosures are a way of attaching multimedia content to RSS feeds by providing the
URL of a file associated with an entry, such as an MP3 file to a music recommendation or a
photo to a diary entry. Unlike e-mail attachments, enclosures are merely hyperlinks to files;
the actual data is not embedded into the feed. Support and implementation among aggregators
varies: if the software understands the specified file format, it may automatically download and
display the content; otherwise it may provide a link to it or silently ignore it.

The addition of enclosures to RSS, as first implemented by Dave Winer in late 2000 [1], was 
an important prerequisite for the emergence of podcasting, arguably the most common use of the 
feature as of 2006. In podcasts and related technologies enclosures are not merely attachments to 
entries, but provide the main content of a feed. 


Syntax 


In RSS 2.0, the syntax for the <enclosure> tag, an optional child of the <item> element, is as 
follows: 
<enclosure url="http://domain.com/file.mp3" length="123456789" type="audio/mpeg" /> 

where the value of the url attribute is the URL of a file, length is its size in bytes, and type
is its MIME type.

There may only be a single <enclosure> per <item>. 
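
The rules above can be sketched as a small reader that extracts the three attributes and rejects an item carrying more than one enclosure. The function name read_enclosure and the sample item are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """<item>
  <title>Episode 1</title>
  <enclosure url="http://domain.com/file.mp3" length="123456789" type="audio/mpeg" />
</item>"""


def read_enclosure(item_xml):
    """Return the enclosure of one <item> as a dict, or None if it has none."""
    item = ET.fromstring(item_xml)
    enclosures = item.findall("enclosure")
    if not enclosures:
        return None  # <enclosure> is optional
    if len(enclosures) > 1:
        raise ValueError("more than one <enclosure> in a single <item>")
    enc = enclosures[0]
    for name in ("url", "length", "type"):
        if enc.get(name) is None:
            raise ValueError("enclosure is missing the %s attribute" % name)
    return {
        "url": enc.get("url"),
        "length": int(enc.get("length")),  # size in bytes
        "type": enc.get("type"),           # MIME type
    }
```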


Similar technologies 


The RSS <enclosure> has similarities to:
• the SMIL <prefetch> element,
• the HTML <link> element with rel="prefetch",[2]
• the HTTP Link header with rel="prefetch" (see RFC 2068 section 19.6.2.4),
• the Atom <link> element with rel="enclosure".


See also 


• Podcast


RSS tracking 


RSS tracking is a set of methods for measuring the usage of RSS feeds.


History 


RSS feeds have been around since 1999 as a form of internet marketing. However, unlike
other forms of publishing information on the internet, it is difficult to track the usage of RSS
feeds. Feed tracking methods have been growing in popularity.


Technology 


There are currently many methods of tracking RSS feeds, all with their own problems in 
terms of accuracy. 

Method 1 

Transparent 1x1 pixel images - These images can be embedded within the content of the RSS
feed by linking to an image held on the web server. The number of requests made for the image
can then be measured from the web server log files. This gives a rough estimate of how many
times the RSS feed has been viewed.

The problem with this method is that not all RSS feed aggregators parse HTML and display
images.
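
As a sketch of this first method, the code below counts requests for a tracking image in web server log lines. It assumes the logs are in the widely used Common Log Format and that each feed item embeds the image with an item parameter; the path /feed-pixel.gif and the parameter name are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

PIXEL_PATH = "/feed-pixel.gif"  # hypothetical location of the 1x1 image

# host ident user [time] "METHOD path PROTOCOL" status ...
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3})')


def count_pixel_requests(lines):
    """Return {item id: number of successful requests for the tracking image}."""
    per_item = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path, _, query = match.group(4).partition("?")
        status = match.group(5)
        # 304 means the client revalidated a cached copy, which is still a view.
        if path == PIXEL_PATH and status in ("200", "304"):
            item_id = parse_qs(query).get("item", ["unknown"])[0]
            per_item[item_id] += 1
    return dict(per_item)
```

The result is only an estimate, for the reason given above: aggregators that do not load images never appear in the log.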

Method 2 

Third-party services - There are services available on the internet that will syndicate your RSS
feed and then track all requests made to their syndication of your RSS feed. These services come
in free and paid forms.

The problem with this method is that all analytical data about the feeds is controlled by the
service provider and so is not easily accessible or transferable.

Method 3 

Unique URL per feed - This method requires heavy web server programming to automatically
generate a different RSS feed URL for each visitor to the website. The visitor's RSS feed activity
can then be tracked accurately using standard web analytics applications.

The problem with this method is that if the feed is syndicated by a third party, for instance
a search engine, the purpose of the unique URLs is defeated, as many people could potentially
view the RSS feed via a single URL.
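
A minimal sketch of the third method follows, assuming the server can keep a table that maps a random token to each visitor. The class name FeedTokens, the /feed/<token>.xml URL layout and the in-memory table are all assumptions for this illustration; a real site would persist the table.

```python
import secrets
from collections import Counter


class FeedTokens:
    """Hand out one feed URL per visitor and count requests per visitor."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._visitor_by_token = {}
        self.requests = Counter()

    def url_for(self, visitor_id):
        # A random token keeps the URL unguessable and unique to the visitor.
        token = secrets.token_urlsafe(16)
        self._visitor_by_token[token] = visitor_id
        return "%s/feed/%s.xml" % (self.base_url, token)

    def record_request(self, path):
        """Attribute a request path to a visitor; return None if unknown."""
        name = path.rsplit("/", 1)[-1]
        if name.endswith(".xml"):
            name = name[:-4]
        visitor = self._visitor_by_token.get(name)
        if visitor is not None:
            self.requests[visitor] += 1
        return visitor
```

If one visitor's URL is republished by a third party, every reader of that copy is counted as the same visitor, which is the weakness described above.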


See also 


• RSS
• Web analytics


RSS TV 


RSS-TV is an XML-based navigation protocol for Internet Media services based on the RSS 
standard. 

The adoption of RSS-TV enables video device manufacturers to develop applications to 
navigate Internet media services. Example video devices include set top boxes, game consoles, 
broadband-connected digital video disc (DVD) players, digital video recorders (DVRs), personal 
video recorders (PVRs) and next-generation mobile phones. By implementing the RSS-TV
protocol, these devices provide user access to a growing library of online media (video, audio
and games) services.

RSS-TV is an extension of RSS and includes additional XML elements and attributes to
enable premium TV-centric features such as:

• Video On Demand (VOD) and Subscription Video On Demand (SVOD)

• Navigating media services (video, audio, games) in a hierarchical fashion

• Capturing user input (such as a user personal identification number (PIN),
search query, or email) and automatic search suggestions

• Network PVR functionality

• Secure download of HD content

• Electronic Program Guide for live streams

• Automatic language selection

• Numeric shortcuts for selecting menu items

Readers familiar with digital video broadcasting (DVB) can compare RSS-TV with the DVB
Service Information standards developed in the 1990s for digital TV EPGs. The difference is that
RSS-TV has been developed for two-way Internet Protocol (IP) networks rather than
broadcasting networks. RSS-TV leverages the increasing availability of products that support
RSS such as caching engines and RSS-enabled content management and publishing systems.

RSS-TV compliant applications can be implemented using any language and operating 
system including AJAX/HTML, Flash, OpenTV, or C. Similarly, service providers can use any 
web service technologies (Java, .NET, PHP) to build RSS-TV compliant services. 


Podcasting 

RSS-compliant feeds that use enclosures for video/audio (podcasting) are fully compliant 
with RSS-TV. RSS-TV compliant clients will display these feeds as a list of menu items and will 
play (or download) the media. 


RSS-TV Namespace 
RSS-TV uses the namespace 'tv'. 


Encoding 


Similar to other XML-based standards, RSS-TV documents are assumed to be 8-Bit Unicode 
Transformation Format (UTF-8) encoded. 


Yahoo! Pipes 


URL: http://pipes.yahoo.com/
Commercial?: Yes
Type of site:
Registration: Required
Owner: Yahoo!
Created by:


Yahoo! Pipes is a Web application from Yahoo! that provides a GUI-based interface for 
building applications that aggregate Web feeds and other services, creating Web-based apps 
from various sources, and publishing those apps. The site works by letting users "pipe" 
information from different sources and then set up rules for how that content should be modified 
(e.g. filtering). A typical example is New York Times thru Flickr, a pipe which takes the New 
York Times RSS feed and adds a photo from Flickr based on the keywords of each item. The site 
is currently in beta. 


FeedBurner 


FeedBurner is a news feed management provider launched in 2004. FeedBurner provides
custom RSS feeds and management tools to bloggers, podcasters, and other web-based content
publishers. Services provided to publishers include traffic analysis and an optional advertising
system. Though it initially was not clear whether advertising would be well-suited to the RSS
format, authors now choose to include advertising in two-thirds of FeedBurner's feeds.
Published feeds are modified in several ways, including automatic links to Digg and del.icio.us,
and "splicing" information from multiple feeds.[5] FeedBurner is a typical Web 2.0 service,
providing web service application programming interfaces (APIs) to allow other software to
interact with it. As of June 22, 2007, FeedBurner hosted 779,820 feeds for 454,969 publishers.

On June 3, 2007, FeedBurner was acquired by Google Inc. for a rumored price of $100
million.


Push technology 


Push technology on the Internet refers to a style of communication protocol where the 
request for a given transaction originates with the publisher, or central server. It is contrasted 
with pull technology, where the request for the transmission of information originates with the 
receiver, or client. 

Server push or webcasting is specifically related to the HTTP protocol, used on the World 
Wide Web. Typical World Wide Web usage is a pull operation - the end-user requests a web 
page using a web browser. 

Push transactions are often based on information preferences expressed in advance - a 
"subscription" model. A home computer user might "subscribe" to various information 
"channels". Whenever new content was available on one of those channels, the server would 
"push" that information out to the user. Such transactions are common within digital marketing 
channels such as the following: 

Instant messaging epitomizes push media. Messages and files are pushed to the user as soon
as they are sent to the messaging service. Some peer-to-peer programs, such as WASTE, also
allow pushing files. In these cases, the sender initiates the transfer rather than the recipient.

E-mail is also a classic Internet push medium; UUCP and SMTP are push protocols.
However, the last step, from a user's "home" mail server to the user's desktop, is typically a
"pull" operation using a pull protocol like POP3 or IMAP. Modern e-mail clients make this last
step appear to be a "push" operation by making repeated "pull" requests, for example by checking
for new mail on an IMAP server every ten minutes.

Most Web feeds, such as RSS, also appear to be push media, but technically are pulled by the 
user. With RSS, the user's aggregator polls the server periodically for new content; the server 
does not send information to the client unrequested. This continual polling is inefficient and has 
contributed to the shutdown or reduction of several popular RSS feeds that could not handle the 
bandwidth. In contrast, a "true" RSS-like push medium would report new syndication items to each
subscriber as soon as they were published and would not require polling. However, "true" push
media present practical difficulties because the server is responsible for remembering who the
subscribers are and the latest address at which each subscriber should be contacted.
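
The pull model described above can be sketched with an in-memory stand-in for a feed server, so the example runs without a network. The class names are hypothetical. The sketch also uses an HTTP entity tag with If-None-Match, a standard way for a polling client to avoid downloading an unchanged feed; the text above does not mention it, so treat it as an added assumption rather than part of RSS itself.

```python
class FakeFeedServer:
    """Stands in for a web server; it only ever answers requests."""

    def __init__(self, body):
        self.body = body
        self.version = 1

    def publish(self, body):
        # New content is stored but NOT sent to anyone: there is no push.
        self.body = body
        self.version += 1

    def get(self, if_none_match=None):
        etag = '"v%d"' % self.version
        if if_none_match == etag:
            return 304, etag, None   # Not Modified: no body is transferred
        return 200, etag, self.body


class PollingClient:
    """An aggregator that must ask the server each time it wants news."""

    def __init__(self, server):
        self.server = server
        self.etag = None

    def poll(self):
        status, etag, body = self.server.get(self.etag)
        if status == 304:
            return None              # nothing new since the last poll
        self.etag = etag
        return body
```

Every call to poll costs the server a request even when nothing has changed, which is the inefficiency noted above; the conditional request only reduces the size of each answer.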

Another type of push technology gained popularity in the 1990s using PointCast software. It 
received considerable media attention, and both Netscape and Microsoft integrated it heavily into 
their software at the height of the browser wars. However, most people did not find push 
technology useful, and it later faded into more obscure corners of software packages. 


HTTP Server Push 


The term server push was originally coined by Netscape in 1995. A special content type was 
added to HTTP called multipart/x-mixed-replace which the Netscape 1.1 and subsequent 
browsers would interpret as a document changing whenever the server felt like pushing a new 
version to the client. It is still supported by Mozilla, Firefox, Safari (but not in Safari on the 
iPhone) and Opera today, but traditionally ignored by Microsoft. It can be applied to HTML 
documents, making it quite popular for webchats, but also for streaming images in webcam 
applications. The Comet technique tries to emulate server push, at the cost of considerable
overhead in JavaScript programming.
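
As a rough sketch of what a server sends when it uses this content type, the generator below produces the body of a multipart/x-mixed-replace response, one part per document version. The boundary string "frame", the function name and the use of a Content-Length header in each part are choices made for this example; a real server would write the chunks to an open connection over time rather than build a list.

```python
# The response would be announced with a header such as this one.
RESPONSE_HEADER = "Content-Type: multipart/x-mixed-replace; boundary=frame"


def mixed_replace_parts(documents, boundary="frame", content_type="text/html"):
    """Yield the byte chunks of a multipart/x-mixed-replace response body."""
    for doc in documents:
        payload = doc.encode("utf-8")
        # Each part replaces the previous one in the browser window.
        yield (
            ("--%s\r\n" % boundary).encode("ascii")
            + ("Content-Type: %s\r\n" % content_type).encode("ascii")
            + ("Content-Length: %d\r\n\r\n" % len(payload)).encode("ascii")
            + payload
            + b"\r\n"
        )
    # The closing boundary tells the client that no further versions follow.
    yield ("--%s--\r\n" % boundary).encode("ascii")
```

For a webcam, the same structure would carry image parts (for example with content type image/jpeg and raw bytes as payload) instead of HTML text.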

The WHATWG is drafting a Web Applications 1.0 specification[1] that includes a mechanism for
pushing content to the client. On September 1, 2006, the Opera web browser implemented this new
experimental technology in a feature called "Server-Sent Events."


Wireless Push Technology 


The original BlackBerry was the first popular example of push technology in a wireless
context. Short e-mails were sent to the device automatically, without the user requesting them.
Wireless push technology has advanced since 1998 and has grown to include more
complex information.[4] The newer Windows Mobile versions (Windows Mobile 5 and 6) use
push technology to send and receive e-mail wirelessly and to synchronize tasks and calendars.


Web feed 


A web feed is a data format used for serving users frequently updated content. Content
distributors syndicate a web feed, thereby allowing users to subscribe to it. Making a collection
of web feeds accessible in one spot is known as aggregation, which is performed by an Internet
aggregator.

In the typical scenario of using web feeds, a content provider publishes a feed link on their 
site which end users can register with an aggregator program (also called a feed reader or a news 
reader) running on their own machines; doing this is usually as simple as dragging the link from 
the web browser to the aggregator. When instructed, the aggregator asks all the servers in its feed 
list if they have new content; if so, the aggregator either makes a note of the new content or 
downloads it. Aggregators can be scheduled to check for new content periodically. 
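
The checking step described above can be sketched as follows. The aggregator remembers which items it has already seen for each subscribed feed and reports only the new ones. The class name, the use of guid (falling back to link) as the item identity, and the injected fetch function are assumptions for this illustration; fetch stands in for an HTTP download so the sketch runs offline.

```python
import xml.etree.ElementTree as ET


class Aggregator:
    """Keep a feed list and report items not seen on earlier checks."""

    def __init__(self, fetch):
        self.fetch = fetch   # callable: feed URL -> XML text
        self.seen = {}       # feed URL -> set of item identifiers

    def subscribe(self, url):
        self.seen.setdefault(url, set())

    def check_all(self):
        """Ask every subscribed feed for content; return new (url, title) pairs."""
        new_items = []
        for url, seen in self.seen.items():
            root = ET.fromstring(self.fetch(url))
            for item in root.iter("item"):
                item_id = item.findtext("guid") or item.findtext("link")
                if item_id and item_id not in seen:
                    seen.add(item_id)
                    new_items.append((url, item.findtext("title")))
        return new_items
```

Running check_all on a timer gives the periodic checking mentioned above.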

The kinds of content delivered by a web feed are typically HTML (webpage content) or links 
to webpages and other kinds of digital media. Often when websites provide web feeds to notify 
users of content updates, they only include summaries in the web feed rather than the full content 
itself. 

Web feeds are operated by many news websites, weblogs, schools, and podcasters. 


Benefits 


Web feeds have some advantages compared to receiving frequently published content
via email:

• When subscribing to a feed, users do not disclose their email address, so users
are not increasing their exposure to threats associated with email: spam,
viruses, phishing, and identity theft.

• If users want to stop receiving news, they do not have to send an "unsubscribe"
request; users can simply remove the feed from their aggregator.

• The feed items are automatically "sorted" in the sense that each feed URL has its
own set of entries (unlike an email box, where all mails are in one big pile and
email programs have to resort to complicated rules and pattern matching).

A "feed reader" is required for using web feeds. This tool works like an automated e-mail
program, but no e-mail address is needed. The user subscribes to a particular web feed, and
thereafter receives updated content every time the feed is updated. Feed readers may be
online (like a webmail account) or offline; an offline reader downloads the web feed to the
user's system. Recently a number of mobile readers have also arrived on the market.

Feed readers are used in personalized home page services such as My Google, My Yahoo, or
My MSN to make content such as news, weather, and stock quotes appear on the user's personal
page. Content from other sites can also be added to that personalized page, again using feeds.
Organizations can use a web feed server behind their firewall to distribute, manage, and track
the use of internal and external web feeds by users and groups.

Other web-based tools are dedicated primarily to feed reading. One of the most popular
web-based feed readers at this point is Bloglines, which is also free. Firefox, Internet
Explorer 7.0, and many other web browsers allow feeds to be received from the toolbar, using
Live Bookmarks, Favorites, and other techniques to integrate feed reading into the browser.
Finally, there are desktop-based feed readers, e.g. NewsGator and FeedDemon. These are like
email programs for web feeds.


Scraping


Usually a web feed is made available by the same entity that created the content, and
typically the feed comes from the same place as the website. However, not all websites
provide a feed. Sometimes third parties will read the website and create a feed for it by
scraping it.

Sometimes this is done by tools that are hand crafted for each particular website. There are 
also automatic tools such as IrisFeed, feedmaker, Feedity, Feed43, Feedyes, Page2RSS, 
WEB2RSS. 

Scraping is controversial since it distributes the content in a manner that was not chosen by 
the content owner. 


Technical definition 


A web feed is a document (often XML-based) which contains content items with web links 
to longer versions. News websites and blogs are common sources for web feeds, but feeds are 
also used to deliver structured information ranging from weather data to "top ten" lists of hit 
tunes to search results. The two main web feed formats are RSS and Atom. 

"Publishing a feed" and "syndication" are two of the more common terms used to describe 
making available a feed for an information source, such as a blog. Like syndicated print 
newspaper features or broadcast programs, web feed contents may be shared and republished by 
other websites. (For that reason, one popular definition of RSS is Really Simple Syndication.) 

More often, feeds are subscribed to directly by users with aggregators or feed readers, which 
combine the contents of multiple web feeds for display on a single screen or series of screens. 
Some modern web browsers incorporate aggregator features. Depending on the aggregator, users 
typically subscribe to a feed by manually entering the URL of a feed or clicking a link in a web 
browser. 

Web feeds are designed to be machine-readable rather than human-readable, which tends to 
be a source of confusion when people first encounter web feeds. This means that web feeds can 
also be used to automatically transfer information from one website to another, without any 
human intervention. 


See also 


• Atom (standard)
• RSS (file format)
• Web syndication


Web technology


Web analytics | Web application | Web service | FastCGI | .htaccess | Hypervideo | Mirror | 
Website monitoring 


Web analytics 


Web analytics is the study of the behaviour of website visitors. In a commercial context, 
web analytics especially refers to the use of data collected from a web site to determine which 
aspects of the website work towards the business objectives; for example, which landing pages 
encourage people to make a purchase. 

Data collected almost always includes web traffic reports. It may also include e-mail 
response rates, direct mail campaign data, sales and lead information, user performance data such 
as click heat mapping, or other custom metrics as needed. This data is typically compared against 
key performance indicators for performance, and used to improve a web site or marketing 
campaign's audience response. 

Many different vendors provide web analytics software and services. 


Web analytics technologies 


There are two main technological approaches to collecting web analytics data. The first 
method, logfile analysis, reads the logfiles in which the web server records all its transactions. 
The second method, page tagging, uses JavaScript on each page to notify a third-party server 
when a page is rendered by a web browser. 


Web server logfile analysis 


Web servers have always recorded all their transactions in a logfile. It was soon realised that 
these logfiles could be read by a program to provide data on the popularity of the website. Thus 
arose web log analysis software. 

In the early 1990s, web site statistics consisted primarily of counting the number of client 
requests made to the web server. This was a reasonable method initially, since each web site 
often consisted of a single HTML file. However, with the introduction of images in HTML, and 
web sites that spanned multiple HTML files, this count became less useful. The first true
commercial Log Analyzer was released by IPRO in 1994[1].

Two units of measure were introduced in the mid 1990s to gauge more accurately the amount 
of human activity on web servers. These were page views and visits (or sessions). A page view 
was defined as a request made to the web server for a page, as opposed to a graphic, while a visit 
was defined as a sequence of requests from a uniquely identified client that expired after a 
certain amount of inactivity, usually 30 minutes. The page views and visits are still commonly 
displayed metrics, but are now considered rather unsophisticated measurements.

The emergence of search engine spiders and robots in the late 1990s, along with web proxies
and dynamically assigned IP addresses for large companies and ISPs, made it more difficult to
identify unique human visitors to a website. Log analyzers responded by tracking visits by
cookies, and by ignoring requests from known spiders.
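
The measures discussed in this section can be sketched in a few lines. The function below counts raw requests, page views (requests for pages rather than graphics) and visits (sequences of requests from one client separated by no more than 30 minutes of inactivity), and ignores clients whose user agent matches a list of known spiders. The page suffixes, the spider list and the tuple layout of each request are assumptions for this illustration, not taken from any particular log analyzer.

```python
from datetime import datetime, timedelta

PAGE_SUFFIXES = (".html", ".htm", "/")   # what counts as a "page" here
KNOWN_SPIDERS = ("googlebot", "slurp")   # illustrative list only
VISIT_TIMEOUT = timedelta(minutes=30)


def summarize(requests):
    """requests: iterable of (client_id, datetime, path, user_agent) tuples."""
    hits = page_views = visits = 0
    last_seen = {}  # client_id -> time of that client's previous request
    for client, when, path, agent in sorted(requests, key=lambda r: r[1]):
        hits += 1  # every request to the server is counted here
        if any(spider in agent.lower() for spider in KNOWN_SPIDERS):
            continue  # spiders are excluded from the human-activity measures
        previous = last_seen.get(client)
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1  # first request, or the earlier visit has expired
        last_seen[client] = when
        if path.endswith(PAGE_SUFFIXES):
            page_views += 1
    return {"hits": hits, "page_views": page_views, "visits": visits}
```

Here client_id would typically be a cookie value, falling back to the IP address, since proxies and dynamic addresses make the address alone unreliable.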

The extensive use of web caches also presented a problem for logfile analysis. If a person 
revisits a page, the second request will often be retrieved from the browser's cache, and so no 
request will be received by the web server. This means that the person's path through the site is 
lost. Caching can be defeated by configuring the web server, but this can result in degraded 
performance for the visitor to the website. 


Page tagging 


Concerns about the accuracy of logfile analysis in the presence of caching, and the desire to 
be able to perform web analytics as an outsourced service, led to the second data collection 
method, page tagging or "Web bugs".

In the mid 1990s, Web counters were commonly seen — these were images included in a 
web page that showed the number of times the image had been requested, which was an estimate 
of the number of visits to that page. In the late 1990s this concept evolved to include a small 
invisible image instead of a visible one, and, by using JavaScript, to pass along with the image 
request certain information about the page and the visitor. This information can then be 
processed remotely by a web analytics company, and extensive statistics generated. 

The web analytics service also manages the process of assigning a cookie to the user, which 
can uniquely identify them during their visit and in subsequent visits. 

With the increasing popularity of Ajax-based solutions, an alternative to the use of an
invisible image is to implement a call back to the server from the rendered page. In this case,
when the page is rendered in the web browser, a piece of Ajax code calls back to the
server and passes information about the client that can then be aggregated by a web analytics
company.
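
In practice the tag itself is JavaScript running in the visitor's browser. The Python sketch below only models the data flow: how page and visitor details can be packed into the query string of the image request (or callback), and how the collecting server unpacks them. The collector address and the parameter names page, ref, screen and vid are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

COLLECTOR = "http://stats.example.com/pixel.gif"  # hypothetical collecting server


def beacon_url(page, referrer, screen, visitor_id):
    """Build the request URL that the page tag would fetch."""
    query = urlencode({
        "page": page,        # which page was rendered
        "ref": referrer,     # where the visitor came from
        "screen": screen,    # e.g. "1024x768"
        "vid": visitor_id,   # value of the visitor's cookie
    })
    return COLLECTOR + "?" + query


def read_beacon(url):
    """On the collecting side, recover the reported values from the URL."""
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}
```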


Logfile analysis vs page tagging 


Both logfile analysis programs and page tagging solutions are readily available to companies 
that wish to perform web analytics. In many cases, the same web analytics company will offer 
both approaches. The question then arises of which method a company should choose. There are 
advantages and disadvantages to each approach. 

Advantages of logfile analysis 

The main advantages of logfile analysis over page tagging are as follows. 

• The web server normally already produces logfiles, so the raw data is already
available. To collect data via page tagging requires changes to the website.

• The web server reliably records every transaction it makes. Page tagging relies
on the visitors' browsers co-operating, which a certain proportion may not do
(for example, if JavaScript is disabled).

• The data is on the company's own servers, and is in a standard, rather than a
proprietary, format. This makes it easy for a company to switch programs later,
use several different programs, and analyze historical data with a new program.
Page tagging solutions involve vendor lock-in.

• Logfiles contain information on visits from search engine spiders. Although
these should not be reported as part of the human activity, it is important data
for performing search engine optimization.

• Logfiles contain information on failed requests; page tagging only records an
event if the page is successfully viewed.

Advantages of page tagging

The main advantages of page tagging over logfile analysis are as follows.

• The JavaScript is automatically run every time the page is loaded. Thus there are
fewer worries about caching.

• It is easier to add additional information to the JavaScript, which can then be
collected by the remote server. For example, information about the visitors'
screen sizes, or the price of the goods they purchased, can be added in this way.
With logfile analysis, information not normally collected by the web server can
only be recorded by modifying the URL.

• Page tagging can report on events which do not involve a request to the web
server, such as interactions within Flash movies.

• The page tagging service manages the process of assigning cookies to visitors;
with logfile analysis, the server has to be configured to do this.

• Page tagging is available to companies who do not run their own web servers.

Economic factors 

Logfile analysis is almost always performed in-house. Page tagging can be performed in- 
house, but it is more often provided as a third-party service. The economic difference between 
these two models can also be a consideration for a company deciding which to purchase. 

• Logfile analysis typically involves a one-off software purchase; however, some
vendors are introducing maximum annual page views with additional costs to
process additional information.

• Page tagging most often involves a monthly fee, although some vendors offer
installable page tagging solutions with no additional page view costs.

Which solution is cheaper often depends on the amount of technical expertise within the 
company, the vendor chosen, the amount of activity seen on the web sites, the depth and type of 
information sought, and the number of distinct web sites needing statistics. 


Hybrid methods 


Some companies are now producing programs which collect data through both logfiles and 
page tagging. By using a hybrid method, they aim to produce more accurate statistics than either 
method on its own. The first hybrid solution was produced in 1998 by Rufus Evison, who then
spun the product out to create a company based upon the increased accuracy of hybrid methods.


Other methods 
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Other methods of data collection have been used, but are not currently widely deployed. 
These include integrating the web analytics program into the web server, and collecting data by 
sniffing the network traffic passing between the web server and the outside world. 

Page tagging can also be performed on the server side. Instead of collecting the information 
in the user's browser when the page is opened, a script on the server sends the data 
just before the page is delivered to the user. 


Key definitions 


There are no globally agreed definitions within web analytics, although the industry bodies have 
been trying for some time to agree on definitions that are useful and definitive. The main bodies 
who have had input in this area have been JICWEBS (Joint Industry Committee for Web 
Standards)/ABCe (Audit Bureau of Circulations Electronic, UK and Europe), the WAA 
(Web Analytics Association, US) and to a lesser extent the IAB (Interactive Advertising 
Bureau). This does not prevent the following list from being a useful guide, suffering only 
slightly from ambiguity. Both the WAA and the ABCe provide more definitive lists for those 
who are declaring their statistics using the metrics defined by either. 

• Hit - A request for a file from the web server. Available only in log analysis. The 
number of hits received by a website is frequently cited to assert its popularity, 
but this number is extremely misleading and dramatically over-estimates 
popularity. A single web page typically consists of multiple (often dozens of) 
discrete files, each of which is counted as a hit as the page is downloaded, so the 
number of hits is really an arbitrary number more reflective of the complexity of 
individual pages on the website than the website's actual popularity. The total 
number of visitors or page views provides a more realistic and accurate 
assessment of popularity. 

• Page View - In log analysis, a request for a file whose type is defined as a page; 
in page tagging, an occurrence of the script being run. In log analysis, a single 
page view may generate multiple hits as all the resources required to view the 
page (images, .js and .css files) are also requested from the web server. 

• Visit / Session - A series of requests from the same uniquely identified client 
with a set timeout. A visit is expected to contain multiple hits (in log analysis) 
and page views. 

• First Visit / First Session - A visit from a visitor who has not made any 
previous visits. 

• Visitor / Unique Visitor / Unique User - The uniquely identified client 
generating requests on the web server (log analysis) or viewing pages (page 
tagging) within a defined time period (e.g. day, week or month). A Unique Visitor 
counts once within the timescale. A visitor can make multiple visits. N.B. The 
Unique User is now the only mandatory metric for an ABCe audit. 

• Repeat Visitor - A visitor that has made at least one previous visit. The period 
between the last and current visit is called visitor recency and is measured in 
days. 


• New Visitor - A visitor that has not made any previous visits. This definition 
creates a certain amount of confusion (see common confusions below), and is 
sometimes substituted with analysis of first visits. 

• Impression - Each time an advertisement loads on a user's screen. Any time a 
banner is displayed, that is an impression. 

• Singletons - The number of visits where only a single page is viewed. While not 
a useful metric in and of itself, the number of singletons is indicative of various 
forms of "click fraud", and is also used to calculate bounce rate and in 
some cases to identify automatons ("bots"). 

• Bounce Rate / % Exit - The percentage of visits where the visitor enters and 
exits at the same page without visiting any other pages on the site in between. 
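
The difference between hits and page views in log analysis can be illustrated with a short Python sketch. The request list and the set of extensions treated as "pages" are assumptions made for this example; a real log analyser would make the page types configurable.

```python
import posixpath

# Hypothetical requests logged while one visitor loads a single page.
requests = ["/index.html", "/style.css", "/script.js", "/logo.png", "/photo.jpg"]

# Assumption: which file types count as a "page" is a setting of the log analyser.
PAGE_EXTENSIONS = {".html", ".htm", ".php", ""}

def is_page(path):
    """Return True if the requested file type is defined as a page."""
    extension = posixpath.splitext(path)[1].lower()
    return extension in PAGE_EXTENSIONS

hits = len(requests)                                   # every file request is a hit
page_views = sum(1 for path in requests if is_page(path))

print(hits, page_views)  # prints: 5 1
```

One page view produces five hits here, which is why hit counts say more about page complexity than about popularity.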


Common Confusions in Web Analytics 
The Hotel Problem 


The hotel problem is generally the first problem encountered by a user of web analytics. The 
term was coined by Rufus Evison while explaining the problem at one of the Emetrics Summits 
and has now gained popularity as a simple expression of the problem and its resolution. 

The problem is that the unique visitors for each day in a month do not add up to the same 
total as the unique visitors for that month. This appears to an inexperienced user to be a problem 
in whatever analytics software they are using. In fact it is a simple property of the metric 
definitions. 

The way to picture the situation is by imagining a hotel. The hotel has two rooms (Room A 
and Room B). 

          Day 1    Day 2    Day 3    Total 
Room A    John     John     Jane     2 Unique Users 
Room B    Jean     Jane     Jean     2 Unique Users 
Total     2        2        2        ? 

As the table shows, the hotel has two unique users each day over three days. The sum of the 
totals with respect to the days is therefore six. 

During the period each room has had two unique users. The sum of the totals with respect to 
the rooms is therefore four. 

In actual fact only three visitors have been in the hotel over this period. The problem is that a 
person who stays in a room for two nights will get counted twice if you count them once on each 
day, but is only counted once if you are looking at the total for the period. Any software for web 
analytics will sum these correctly for whatever time period, thus leading to the problem when a 
user tries to compare the totals. 
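
The arithmetic of the hotel example can be reproduced with a few lines of Python. This is only an illustrative sketch; the `stays` layout and the variable names are assumptions made for this example and do not come from any analytics package.

```python
# Occupancy taken from the hotel table: (day, room) -> guest.
stays = {
    (1, "A"): "John", (2, "A"): "John", (3, "A"): "Jane",
    (1, "B"): "Jean", (2, "B"): "Jane", (3, "B"): "Jean",
}

days = sorted({day for day, _ in stays})
rooms = sorted({room for _, room in stays})

# Sets of unique guests per day and per room.
unique_per_day = {d: {g for (day, _), g in stays.items() if day == d} for d in days}
unique_per_room = {r: {g for (_, room), g in stays.items() if room == r} for r in rooms}

sum_over_days = sum(len(guests) for guests in unique_per_day.values())    # 2 + 2 + 2
sum_over_rooms = sum(len(guests) for guests in unique_per_room.values())  # 2 + 2
unique_overall = len(set(stays.values()))                                 # John, Jane, Jean

print(sum_over_days, sum_over_rooms, unique_overall)  # prints: 6 4 3
```

Uniqueness has to be computed over the whole period being reported; adding up the unique counts of sub-periods counts returning guests more than once.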


New Visitors + Repeat Visitors unequal to Total Visitors 


Another common misconception in web analytics is that the sum of the new visitors and the 
repeat visitors ought to be the total number of visitors. Again this becomes clear if the visitors 
are viewed as individuals on a small scale, but it still causes a large number of complaints that 
analytics software cannot be working, when the real cause is a failure to understand the metrics. 

Here the culprit is the metric of a new visitor. There is really no such thing as a new visitor 
when you are considering a web site from an ongoing perspective. If a visitor makes their first 
visit on a given day and then returns to the web site on the same day they are both a new visitor 
and a repeat visitor for that day. So if we look at them as an individual which are they? The 
answer has to be both, so the definition of the metric is at fault. 

A new visitor is not an individual; it is a fact of the web measurement. For this reason it is 
easiest to conceptualise the same facet as a first visit (or first session). This resolves the conflict 
and so removes the confusion. Nobody expects the number of first visits to add to the number of 
repeat visitors to give the total number of visitors. The metric will have the same number as the 
new visitors, but it is clearer that it will not add in this fashion. 

On the day in question there was a first visit made by our chosen individual. There was also a 
repeat visit made by the same individual. The number of first visits and the number of repeat 
visits will add up to the total number of visits for that day. 
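
A small Python sketch makes the distinction concrete. The visit log below is hypothetical: visitor "v1" makes a first visit and a repeat visit on the same day, and visitor "v2" makes a repeat visit.

```python
# Hypothetical visits for one day: (visitor_id, is_first_ever_visit).
visits = [("v1", True), ("v1", False), ("v2", False)]

first_visits = sum(1 for _, is_first in visits if is_first)
repeat_visits = sum(1 for _, is_first in visits if not is_first)

new_visitors = {visitor for visitor, is_first in visits if is_first}
repeat_visitors = {visitor for visitor, is_first in visits if not is_first}
total_visitors = {visitor for visitor, _ in visits}

# Visits add up to the total number of visits.
print(first_visits + repeat_visits, len(visits))                      # prints: 3 3
# Visitors do not: v1 is both a new and a repeat visitor on this day.
print(len(new_visitors) + len(repeat_visitors), len(total_visitors))  # prints: 3 2
```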


Web analytics methods 
Problems with cookies 


Historically, vendors of page-tagging analytics solutions have used third-party cookies, that 
is, cookies sent from the vendor's domain instead of the domain of the website being browsed. 
Third-party cookies can handle visitors who cross multiple unrelated domains within the 
company's site, since the cookie is always handled by the vendor's servers. 

However, third-party cookies in principle allow tracking an individual user across the sites of 
different companies, allowing the analytics vendor to collate the user's activity on sites where he 
provided personal information with his activity on other sites where he thought he was 
anonymous. Although web analytics companies deny doing this, other companies such as 
companies supplying banner ads have done so. Privacy concerns about cookies have therefore 
led a noticeable minority of users to block or delete third-party cookies. In 2005, some reports 
showed that about 28% of Internet users blocked third-party cookies and 22% deleted them at 
least once a month [4]. 

Most vendors of page tagging solutions have now moved to provide at least the option of 
using first-party cookies (cookies assigned from the client subdomain). 

Another problem is cookie deletion. When web analytics depend on cookies to identify 
unique visitors, the statistics are dependent on a persistent cookie to hold a unique visitor ID. 
When users delete cookies, they usually delete both first- and third-party cookies. If this is done 
between interactions with the site, the user will appear as a first-time visitor at their next 
interaction point. Without a persistent and unique visitor ID, conversions, click-stream analysis, 
and other metrics dependent on the activities of a unique visitor over time cannot be accurate. 
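
The effect of cookie deletion can be sketched in Python. The cookie name `visitor_id`, the `identify` function and the dictionary standing in for the browser's cookie store are all assumptions made for illustration; they are not taken from any particular analytics product.

```python
import uuid

known_visitors = set()  # visitor IDs the analytics system has already seen

def identify(cookies):
    """Return (visitor_id, is_first_visit), setting the ID cookie if it is missing."""
    visitor_id = cookies.get("visitor_id")
    if visitor_id is None:
        visitor_id = uuid.uuid4().hex
        cookies["visitor_id"] = visitor_id  # would be a persistent cookie in practice
    is_first_visit = visitor_id not in known_visitors
    known_visitors.add(visitor_id)
    return visitor_id, is_first_visit

browser_cookies = {}
id1, first1 = identify(browser_cookies)  # first interaction: new visitor
id2, first2 = identify(browser_cookies)  # same cookie: recognised as returning
browser_cookies.clear()                  # the user deletes their cookies
id3, first3 = identify(browser_cookies)  # same person now looks like a new visitor

print(first1, first2, first3)  # prints: True False True
```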

Cookies are used because IP addresses are not always unique to users and may be shared by 
large groups or proxies. Other methods of uniquely identifying a user are technically challenging 
and would limit the trackable audience or would be considered suspicious. Cookies are the 
selected option because they reach the lowest common denominator without using technologies 
regarded as spyware. 


Unique landing pages vs referrals for campaign tracking 


Tracking the amount of activity generated through advertising relationships with external 
web sites through the referrals reports available in most web analytics packages is significantly 
less accurate than using unique landing pages. 

Referring URLs are an unreliable source of information for the following reasons: 

• They may or may not be provided by the web browser. 

• They may or may not be recorded by the web server. 

• They can be obfuscated intentionally by users who wish to browse 
anonymously. 

• They can be distorted or hidden by redirects, intentionally or not. 


Bounce Rate 


Bounce Rate (also called % Exit) is a term used in website traffic analysis. A bounce occurs 
when a website visitor leaves a page or a site without visiting any other pages before a certain 
session timeout elapses. There is no standard minimum or maximum time by which a visitor 
must leave in order for a bounce to occur; rather, this is determined by the 
session timeout of the analytics tracking software. A commonly used session timeout value is 30 
minutes. In this case, if a visitor views a page and leaves the browser idle for 31 minutes, the 
visit will register as a bounce. If the visitor then continues to navigate after this delay, a new 
session begins, which can in turn register as a bounce if only one page is viewed before the 
visitor exits or times out again. Thus, it is important to note the dependency between bounces and sessions. 

The Bounce Rate for a single page is the number of people who visit a page and leave within 
the specified timeout period, divided by the total number of people who have visited the page. In 
contrast, the Bounce Rate for a website is the number of web site visitors who visit only a single 
page of a website per session divided by the total number of website visitors. 
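
The dependency between session timeouts and bounces can be sketched in Python. The function names and the sample page views are hypothetical, and the calculation below counts bounces per session (visit), following the key definitions given earlier; real analytics packages may differ in detail.

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; the commonly used 30-minute value

def sessions_from_page_views(page_views, timeout=SESSION_TIMEOUT):
    """Group (visitor_id, timestamp_in_seconds, page) tuples into sessions."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, page in page_views:
        by_visitor[visitor].append((timestamp, page))
    sessions = []
    for views in by_visitor.values():
        views.sort()
        current = [views[0][1]]
        last_timestamp = views[0][0]
        for timestamp, page in views[1:]:
            if timestamp - last_timestamp > timeout:
                sessions.append(current)  # idle too long: close the session
                current = []
            current.append(page)
            last_timestamp = timestamp
        sessions.append(current)
    return sessions

def site_bounce_rate(sessions):
    """Share of sessions in which only a single page was viewed."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return bounces / len(sessions)

page_views = [
    ("a", 0, "/home"),            # visitor a views one page, then idles for 31 minutes
    ("a", 31 * 60, "/home"),      # a new session starts after the timeout
    ("a", 32 * 60, "/products"),
    ("b", 100, "/home"),          # visitor b views a single page and leaves
]
sessions = sessions_from_page_views(page_views)
print(len(sessions), round(site_bounce_rate(sessions), 2))  # prints: 3 0.67
```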

Bounce rates can be used to help determine the effectiveness or performance of an entry 
page. An entry page with a low bounce rate means that the page effectively causes visitors to 
view more pages and continue on deeper into the website. 

Avinash Kaushik, Google's Analytics Evangelist, states, "it is really hard to get a bounce rate 
under 20%, anything over 35% is cause for concern, 50% (above) is worrying." 

It is also important to consider that bounce rates mean different things for different pages on 
a website. For example, if the page that you are investigating is the final step of a process 
(purchasing a product, registering for information), having a high bounce rate should not be 
unexpected. In contrast, if the page that you are analyzing makes up the first step of the 
process (viewing a product), then a high bounce rate might be cause for concern. 


See also 


• Web analytics 


Web application 


In software engineering, a Web application or webapp is an application that is accessed via 
a web browser over a network such as the Internet or an intranet. 

Web applications are popular due to the ubiquity of the web browser as a client, sometimes called a thin client. 
The ability to update and maintain Web applications without distributing and installing software 
on potentially thousands of client computers is a key reason for their popularity. Web 
applications are used to implement Webmail, online retail sales, online auctions, wikis, 
discussion boards, Weblogs, MMORPGs and many other functions. 


History 


In earlier types of client-server computing, each application had its own client program 
which served as its user interface and had to be separately installed on each user's personal 
computer. An upgrade to the server part of the application would typically require an upgrade to 
the clients installed on each user workstation, adding to the support cost and decreasing 
productivity. 

In contrast, Web applications dynamically generate a series of Web documents in a standard 
format supported by common browsers such as HTML/XHTML. Client-side scripting in a 
standard language such as JavaScript is commonly included to add dynamic elements to the user 
interface. Generally, each individual Web page is delivered to the client as a static document, but 
the sequence of pages can provide an interactive experience, as user input is returned through 
Web form elements embedded in the page markup. During the session, the Web browser 
interprets and displays the pages, and acts as the universal client for any Web application. 


Interface 


The Web interface places very few limits on client functionality. Through Java, JavaScript, 
DHTML, Flash and other technologies, application-specific methods such as drawing on the 
screen, playing audio, and access to the keyboard and mouse are all possible. General purpose 
techniques such as drag and drop are also supported by these technologies. Web developers often 
use client-side scripting to add functionality, especially to create an interactive experience that 
does not require page reloading (which many users find disruptive). Recently, technologies have 
been developed to coordinate client-side scripting with server-side technologies such as PHP. 
Ajax, a web development technique using a combination of various technologies, is an example 
of technology which creates a more interactive experience. 


Technical considerations 


A significant advantage of building Web applications to support standard browser features is 
that they should perform as specified regardless of the operating system or OS version installed 
on a given client. Rather than creating clients for MS Windows, Mac OS X, GNU/Linux, and 
other operating systems, the application can be written once and deployed almost anywhere. 
However, inconsistent implementations of the HTML, CSS, DOM and other browser 
specifications can cause problems in web application development and support. Additionally, the 
ability of users to customize many of the display settings of their browser (such as selecting 
different font sizes, colors, and typefaces, or disabling scripting support) can interfere with 
consistent implementation of a Web application. 

Another (less common) approach is to use Adobe Flash or Java applets to provide some or all 
of the user interface. Since most Web browsers include support for these technologies (usually 
through plug-ins), Flash- or Java-based applications can be implemented with much of the same 
ease of deployment. Because they allow the programmer greater control over the interface, they 
bypass many browser-configuration issues, although incompatibilities between Java or Flash 
implementations on the client can introduce different complications. Because of their 
architectural similarities to traditional client-server applications, with a somewhat "thick" client, 
there is some dispute over whether to call systems of this sort "Web applications"; an alternative 
term is "Rich Internet Application". 


Structure 


Though many variations are possible, a Web application is commonly structured as a 
three-tiered application. In its most common form, a Web browser is the first tier, an engine using 
some dynamic Web content technology (such as ASP, ASP.NET, CGI, ColdFusion, JSP/Java, 
PHP, Python, or Ruby On Rails) is the middle tier, and a database is the third tier. The Web 
browser sends requests to the middle tier, which services them by making queries and updates 
against the database and generates a user interface. 
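
The three tiers can be sketched in a few lines of Python. This is a minimal illustration, not a production design: an in-memory SQLite database stands in for the third tier, the hypothetical `handle_request` function plays the middle tier, and the final call stands in for a request sent by the browser.

```python
import sqlite3
from html import escape

# Third tier: the database (in-memory SQLite stands in for a database server).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO products (name) VALUES (?)", [("Widget",), ("Gadget",)])
db.commit()

# Middle tier: services a request by querying the database and generating the UI.
def handle_request(path, connection):
    if path != "/products":
        return 404, "<h1>Not found</h1>"
    rows = connection.execute("SELECT name FROM products ORDER BY id").fetchall()
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    return 200, f"<ul>{items}</ul>"

# First tier: a browser would send this request and render the returned markup.
status, body = handle_request("/products", db)
print(status, body)  # prints: 200 <ul><li>Widget</li><li>Gadget</li></ul>
```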


Business use 


An emerging strategy for application software companies is to provide Web access to 
software previously distributed as local applications. Depending on the type of application, it 
may require the development of an entirely different browser-based interface, or merely adapting 
an existing application to use different presentation technology. These programs allow the user to 
pay a monthly or yearly fee for use of a software application without having to install it on a 
local hard drive. A company which follows this strategy is known as an application service 
provider (ASP), and ASPs are currently receiving much attention in the software industry. 


Writing Web applications 


There are many Web application frameworks which facilitate rapid application development 
by allowing the programmer to define a high-level description of the program. In addition, there 
is potential for the development of applications on Internet Operating Systems, although 
currently there are not many viable platforms that fit this model. 

The use of Web application frameworks can often reduce the number of errors in a program, 
both by making the code simpler, and by allowing one team to concentrate just on the 
framework. In applications which are exposed to constant hacking attempts on the Internet, 
security-related problems caused by errors in the program are a big issue. Frameworks may also 
promote the use of best practices such as GET after POST. 
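
The GET after POST practice can be sketched as follows. The `handle` function, the `/comments` path and the in-memory list are hypothetical and only illustrate the idea: after processing a POST, the application answers with a redirect so that the browser fetches the result with GET, and reloading the result page does not resubmit the form.

```python
def handle(method, path, form, comments):
    """Return (status, headers, body) for a tiny comment form."""
    if method == "POST" and path == "/comments":
        comments.append(form["text"])
        # 303 See Other: the browser should fetch the result with a GET request.
        return 303, {"Location": "/comments"}, ""
    if method == "GET" and path == "/comments":
        return 200, {"Content-Type": "text/plain"}, "\n".join(comments)
    return 404, {}, "not found"

comments = []
post_status, post_headers, _ = handle("POST", "/comments", {"text": "Hello"}, comments)
get_status, _, get_body = handle("GET", post_headers["Location"], {}, comments)

print(post_status, get_status, get_body)  # prints: 303 200 Hello
```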

The Web Application Security Consortium (WASC), CGI Security, and OWASP are projects 
developed with the intention of documenting how to avoid security problems in Web 
applications. 


See also 


• AJAX 
• CSS 
• Web application framework 
• Web services 
• List of web application frameworks 
• World Wide Web Consortium (Web standards) 


Online spreadsheets | Rich Internet Applications | Web application framework | e-Recruitment | 
Electronic business | Electronic commerce | Freight exchange | Mashup | Online banking | Online 
gambling | Online Office | Shop bot | Web application development | Web-based email 


Online spreadsheets 


This is a list of online spreadsheets. Online spreadsheets are online applications that allow 
users to create a spreadsheet, have multiple persons edit it and share it with the world. 


• EditGrid - Free unsecured personal accounts. Premium account for security or 
organizations. 
• Google Docs & Spreadsheets - Free 
• Simple Spreadsheet - Free and open source 
• ThinkFree Calc - Free and Premium services 
• wikiCalc - Free, download application for online use 
• Zoho Office Suite - Free and Pro accounts 


Rich Internet Applications 


This is a list of rich Internet applications. They are organised by their use. 


Communication 


• Gmail - Webmail client for Google's email service 
• Meebo - An instant messaging client for multiple platforms 
• Outlook Web Access - The webmail client used in Microsoft's Exchange Server 
• RoundCube - Webmail Client Project 
• Yahoo! Mail (beta) - A webmail client for Yahoo!'s email service 
• Windows Live Hotmail - Webmail client for Microsoft's email service 


Document managing, editing, sharing and collaborating 


• EditGrid - A spreadsheet application 
• Flickr - A photo sharing service owned by Yahoo! 
• FORscene - An online video editor 
• Google Docs & Spreadsheets - A document and spreadsheet manager with 
multiple file support and online storage 
• Wrike - An online collaboration tool transparently integrated with e-mail 
• Zoho Office Suite - An online office suite 
• Cyclone3 XULadmin - content management system based on XUL 
• 37signals - a suite of online collaborating applications targeted mainly at 
business 


ERP 


• 24SevenOffice - An ERP/CRM application 


Personal information management 


• 30 Boxes - An online calendar 
• Google Calendar - An online calendar 
• netvibes - A personalised start page with support for XML, iCalendar and RSS 
• Windows Live Favourites - A favourite links manager 
• Zimbra - An online PIM with Mail, Calendar and more 


Web application framework 


A web application framework is a software framework that is designed to support the 
development of dynamic websites, Web applications and Web services. The framework aims to 
alleviate the overhead associated with common activities used in Web development. For 
example, many frameworks provide libraries for database access, templating frameworks and 
session management, and often promote code reuse. 


History 


Common Gateway Interface 


As the design of the World Wide Web was not inherently dynamic, early hypertext consisted 
of hand-coded HTML that was published on web servers. Any modifications to published pages 
needed to be performed by the pages' author. To provide a dynamic web page that reflected user 
inputs, the Common Gateway Interface (CGI) standard was introduced for interfacing external 
applications with web servers. CGI could adversely affect server load, though, since each 
request had to start a separate process. 
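
A CGI program receives the request details through environment variables such as QUERY_STRING and writes the response headers, a blank line and the body to standard output. The following Python sketch shows the general shape; the `build_response` helper and the `name` parameter are assumptions made for this example.

```python
import os
import sys
from html import escape
from urllib.parse import parse_qs

def build_response(environ):
    """Build a CGI response (headers, blank line, body) from the environment."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    body = f"<p>Hello, {escape(name)}!</p>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # The web server starts this program as a separate process for each request.
    sys.stdout.write(build_response(os.environ))
```

Because the server launches a new process for every request, the start-up cost of the interpreter is paid each time, which is the load problem described above.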


Tighter integration 


Programmers wanted tighter integration with the web server to enable high traffic web 
applications. The Apache HTTP Server, for example, supports modules that can extend the web 
server with arbitrary code executions (such as mod_python) or forward specific requests to a web 
server that can handle dynamic content (such as mod_jk). Some web servers (such as Apache 
Tomcat) were specifically designed to handle dynamic content by executing code written in 
some languages, such as Java. 


Web languages 


Around the same time, new languages were being developed specifically for use in the web, 
such as PHP and Active Server Pages. 


Web libraries 


While the vast majority of languages available to programmers for creating dynamic 
web pages have libraries to help with common tasks, web applications often require 
libraries for tasks specific to the web, such as creating HTML (for example, JavaServer 
Faces). 


Full Stack 


Eventually, mature, "full stack" frameworks appeared, that often gathered multiple libraries 
useful for web development into a single cohesive software stack for web developers to use. 


Architectures 
Model view controller 


Many frameworks follow the Model View Controller (MVC) architectural pattern to separate 
the data model, business rules and user interface. 

Push-based vs. Pull-based 

Most MVC frameworks follow a push-based architecture. These frameworks use actions that 
do the required processing, and then "push" the data to the view layer to render the results. 
Struts and Ruby on Rails are good examples of this architecture. 

An alternative to this is pull-based architecture, sometimes also called "component-based". 
These frameworks start with the view layer, which can then "pull" results from multiple 
controllers as needed. In this architecture, multiple controllers can be involved with a single 
view. Tapestry and JBoss Seam are examples of pull-based architectures. 
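
The difference between the two styles can be sketched in plain Python. All names below are hypothetical and no real framework is used; the point is only who initiates the flow of data.

```python
# Model: a tiny in-memory data store.
HOUSES = {1: {"street": "Elm Street", "price": 250000}}

# Push-based: a controller action does the processing, then pushes data to the view.
def render_house(data):                  # view
    return f"{data['street']}: {data['price']}"

def show_house_action(house_id):         # controller action
    data = HOUSES[house_id]
    return render_house(data)

# Pull-based: the view starts and pulls results from several components.
def address_component(house_id):         # one controller/component
    return HOUSES[house_id]["street"]

def price_component(house_id):           # another controller/component
    return HOUSES[house_id]["price"]

def house_page(house_id):                # view
    return f"{address_component(house_id)}: {price_component(house_id)}"

print(show_house_action(1))  # prints: Elm Street: 250000
print(house_page(1))         # prints: Elm Street: 250000
```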


Features 
Security 


Some web application frameworks come with authentication and authorization frameworks 
that enable the web server to identify the users of the application and restrict access to functions 
based on some defined criteria. Django is one example that provides role-based access to pages, 
and provides a web-based interface for creating users and assigning them roles. 


Database access and mapping 


Many web application frameworks create a unified API to a database backend, enabling web 
applications to work with a variety of databases with no code changes, and allowing 
programmers to work with higher-level concepts. Additionally, some object-oriented 
frameworks contain mapping tools to provide Object-Relational Mapping, which will map 
objects to tuples. 
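
The idea of mapping objects to tuples can be sketched with the Python standard library. This is a hand-written illustration under simplifying assumptions, not how any particular framework implements Object-Relational Mapping; the `House` class and the `save` and `load` helpers are hypothetical.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields

@dataclass
class House:
    id: int
    street: str
    price: int

def create_table(conn):
    conn.execute("CREATE TABLE house (id INTEGER PRIMARY KEY, street TEXT, price INTEGER)")

def save(conn, obj):
    """Store the object as one tuple (row) in the house table."""
    placeholders = ", ".join("?" for _ in fields(obj))
    conn.execute(f"INSERT INTO house VALUES ({placeholders})", astuple(obj))

def load(conn, house_id):
    """Rebuild an object from the stored tuple, or return None if it is missing."""
    row = conn.execute(
        "SELECT id, street, price FROM house WHERE id = ?", (house_id,)
    ).fetchone()
    return House(*row) if row else None

conn = sqlite3.connect(":memory:")
create_table(conn)
save(conn, House(1, "Elm Street", 250000))
print(load(conn, 1))  # prints: House(id=1, street='Elm Street', price=250000)
```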

Other features web application frameworks may provide include transactional support and 
database migration tools. 


URL mapping 


By automatically rewriting a URL with parameters to a friendly URL, the system becomes 
easier to use and, as an additional benefit, is better indexed by search engines. An example 
would be rewriting an address ending in ?cat=1&pageid=3 to /category/science/topic/physics or just 
/science/physics. When the ID of the category changes, the URL can stay the same (hence the 
advantage for search engines). Rewriting URLs can help make an application better conform to 
some elements of RESTful design practices. 
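
A friendly URL such as /science/physics can be mapped back to internal parameters with a simple route table. The sketch below is illustrative; the lookup dictionaries and the `resolve` function are assumptions for this example and do not represent a specific framework's routing API.

```python
import re

# Hypothetical lookup tables from friendly names to internal IDs.
CATEGORY_IDS = {"science": 1}
PAGE_IDS = {("science", "physics"): 3}

ROUTE = re.compile(r"^/(?P<category>[a-z]+)/(?P<topic>[a-z]+)$")

def resolve(path):
    """Translate a friendly URL path into the internal query parameters."""
    match = ROUTE.match(path)
    if match is None:
        return None
    category, topic = match["category"], match["topic"]
    try:
        return {"cat": CATEGORY_IDS[category], "pageid": PAGE_IDS[(category, topic)]}
    except KeyError:
        return None

print(resolve("/science/physics"))  # prints: {'cat': 1, 'pageid': 3}
```

If the internal ID of the category changes, only the lookup table changes; the public URL stays the same.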


Web template system 


Dynamic web pages usually consist of a static part (HTML) and a dynamic part, which is 
code that generates HTML. The code that generates the HTML can do this based on variables in 
a template, or on code. The text to be generated can come from a database, thereby making it 
possible to dramatically reduce the number of pages in a site. 

Consider the example of a real estate agent with 500 houses for sale. In a static web site, the 
agent would have to create 500 pages in order to make the information available. In a dynamic 
website, the agent would simply connect the dynamic page to a database table of 500 records. 

In a template, variables from the programming language can be inserted without using code, 
thereby removing the need for programming knowledge to make updates to the pages in a 
web site. A syntax is made available to distinguish between HTML and variables. For example, in JSP the 
<c:out> tag is used to output variables, and in Smarty, {$variable} is used. 

Many template engines do support limited logic tags, like IF and FOREACH. These are to be 
used only for decisions that need to be made for the presentation layer, in order to keep a clean 
separation from the business logic layer, or the M(odel) in the MVC pattern. 
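
Variable substitution in a template can be illustrated with Python's standard string.Template class, which marks variables with a $ sign. The sketch reuses the real estate example with made-up data; it shows only substitution, not the logic tags that fuller template engines provide.

```python
from string import Template

# The static part of the page, with $variables marking the dynamic part.
page = Template("<h1>$street</h1><p>Price: $price</p>")

# Hypothetical records, as they might come from a database table.
houses = [
    {"street": "1 Elm Street", "price": "250,000"},
    {"street": "2 Oak Avenue", "price": "310,000"},
]

# One template produces a page per record.
pages = [page.substitute(house) for house in houses]
print(pages[0])  # prints: <h1>1 Elm Street</h1><p>Price: 250,000</p>
```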

Some frameworks will generate a basic, but fully-functional, application, once the developer 
specifies the structure of the data, or it is discovered through introspection. In some frameworks, 
this scaffolding supports the create, read, update and delete (CRUD) operations necessary to 
manipulate data. 


Caching 


Web caching is the caching of web documents in order to reduce bandwidth usage, server 
load, and perceived "lag". A web cache stores copies of documents passing through it; 
subsequent requests may be satisfied from the cache if certain conditions are met. Some 
application frameworks provide mechanisms for caching documents and bypassing the web 
template system. 
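
A minimal sketch of caching rendered documents is shown below. The function names are hypothetical, and the cache never expires entries; a real cache would also need rules for when a stored copy may be reused and when it must be regenerated.

```python
render_count = 0   # counts how often the (expensive) template system runs
_cache = {}        # house_id -> rendered document

def render_page(house_id):
    """Stand-in for rendering a page through the web template system."""
    global render_count
    render_count += 1
    return f"<h1>House {house_id}</h1>"

def cached_page(house_id):
    """Serve the stored copy if there is one, otherwise render and store it."""
    if house_id not in _cache:
        _cache[house_id] = render_page(house_id)
    return _cache[house_id]

cached_page(1)
cached_page(1)   # served from the cache, the template system is bypassed
cached_page(2)
print(render_count)  # prints: 2
```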


Ajax 


Ajax, shorthand for "Asynchronous JavaScript and XML", is a web development technique 
for creating interactive web applications. The intent is to make web pages feel more responsive 
by exchanging small amounts of data with the server behind the scenes, so that the entire web 
page does not have to be reloaded each time the user requests a change. This is intended to 
increase the web page's interactivity, speed, and usability. 

Due to complexity of Ajax programming, there are numerous Ajax frameworks that 
exclusively deal with Ajax support. Some Ajax frameworks are even embedded as a part of 
larger frameworks. For example, the Prototype JavaScript Framework is included in Ruby on 
Rails. 


Automatic configuration 


Some frameworks minimize web application configuration through the use of introspection 
and/or following known conventions. For example, many Java frameworks use Hibernate as a 
persistence layer, which can generate a database schema at runtime capable of persisting the 
necessary information. This allows the application designer to design business objects without 
needing to explicitly define a database schema. Frameworks such as Ruby on Rails can also 
work in reverse, that is, define properties of model objects at runtime based on a database 
schema. 
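
Defining model properties at runtime from a database schema can be sketched with SQLite introspection. This is an illustration of the idea only, not the mechanism used by Ruby on Rails or Hibernate; the `model_from_table` helper is hypothetical.

```python
import sqlite3
from collections import namedtuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE house (id INTEGER PRIMARY KEY, street TEXT, price INTEGER)")
conn.execute("INSERT INTO house VALUES (1, 'Elm Street', 250000)")

def model_from_table(connection, table):
    """Create a model class whose attributes mirror the table's columns."""
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    columns = [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]
    return namedtuple(table.capitalize(), columns)

House = model_from_table(conn, "house")
house = House(*conn.execute("SELECT * FROM house").fetchone())
print(house.street, house.price)  # prints: Elm Street 250000
```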


Web services 


Main article: Web services 

Some frameworks provide tools for creating and providing web services. These utilities may 
offer tools similar to those available for the rest of the web application. 


Technologies 


See also: List of web application frameworks 
Languages 


Many languages have an associated web application framework. However, certain languages 
either have a critical mass of developers to give a higher level of support to frameworks, or 
provide features that prove conducive to the development of web application frameworks. 

Java 

There are numerous Java frameworks either in development or in use. Many of these 
frameworks are built on top of, or borrow elements from the Java EE platform. 

ASP.NET 

For Microsoft's ASP.NET platform there is the DotNetNuke web application framework. 
Since ASP.NET itself is a technology designed for building web apps it's often wrongly referred 
to as a web application framework. 

PHP 

PHP's original design for dynamic web pages has given support to projects such as CakePHP, 
symfony, Zoop_Framework, the eZ publish web publishing framework and the Zend 
Framework. 

Perl, Python and Ruby 

There are numerous dynamic language frameworks. Perl has Maypole and Catalyst. Python 
has Django, TurboGears, pylons, Quixote, and Karrigell. Ruby has Nitro and Ruby on Rails, a 
particularly well known framework. 

Smalltalk 

Seaside is an open source web application framework for developing web applications in 
Smalltalk. Although the main development of Seaside happens in Squeak there exist ports for 
other Smalltalk dialects. 

JavaScript 

Helma is an open source web application framework / publishing system written in Java 
which uses JavaScript as its programming language. 


Operating Systems 


With very few exceptions, web application frameworks are based upon platform-independent 
languages that run on a variety of platforms. While some frameworks may recommend particular 
configurations, most can run on Windows, Linux, Mac and other Unix-based platforms. A 
notable exception is DotNetNuke, written for the .NET Framework, which does not support the 
Mono runtime. 


See also 


• List of web application frameworks 
• JavaScript library 


List of web application frameworks 


This is a list of notable web application frameworks, used for creating web applications. 


ColdFusion 


• ColdSpring 
• Fusebox 
• Mach-II 
• Model-Glue 

Java 


• Apache Cocoon 
• Apache Struts 
• AppFuse 
• Aranea framework 
• Google Web Toolkit 
• Grails 
• Hamlets 
• JavaServer Faces 
• JBoss Seam 
• OpenLaszlo 
• OpenXava 
• Reasonable Server Faces (RSF) 
• RIFE 
• Shale Framework (software) 
• SmartClient 
• Spring Framework 
• Stripes (framework) 
• Tapestry 
• ThinWire 
• WebObjects 
• WebWork 
• Wicket framework 
• ZK Framework 


JavaScript 


See also: JavaScript library 

Client-side 


• AJILE 
• Clean AJAX 
• Dojo Toolkit 
• Echo 
• Ext 
• JQuery 
• Microsoft AJAX Library 
• Mochikit 
• Mootools 
• Prototype JavaScript Framework 
• qooxdoo 
• Rialto Toolkit 
• Rico 
• Script.aculo.us 
• SmartClient 
• Spry framework 
• Yahoo! UI Library 


Server-side 


• Helma Object Publisher 


Perl 


• Catalyst
• Interchange
• Maypole
• Mason


PHP 


• BlueShoes
• CakePHP
• Canvas Framework
• CodeIgniter
• DIY Framework
• FUSE
• Horde
• PHP For Applications
• PHPOpenbiz
• PRADO
• Qcodo
• Seagull PHP Framework
• Symfony
• Xaraya
• Zend Framework
• Zoop Framework


Python


• CherryPy
• Django
• Karrigell
• Nevow
• Porcupine
• Pylons
• Spyce
• TurboGears
• TwistedWeb
• Webware
• Zope


Ruby


• Camping (microframework)
• Nitro
• Ruby on Rails


Smalltalk


• Aida/Web
• Seaside


Other/Multiple languages 


• Alpha Five
• DotNetNuke (VB.NET)
• Fusebox (ColdFusion and PHP)
• OpenACS (Tcl)
• UnCommon Web (Common Lisp)
• Yaws (Erlang)
• Magic (Scheme)


e-Recruitment


E-recruitment covers a range of Web-based application tools used, typically, for the provisioning
of human resources. These applications assist in the recruitment of suitable candidates
for vacant positions. Some applications do this by semi-automating the entire recruitment and
hiring process. E-recruitment applications (or software packages that are web-enabled) typically
enable recruitment teams to create job postings, manage job application responses, schedule
interviews and manage other recruitment tasks. This can substantially reduce the labour and money
spent on physical recruitment.

E-recruiting, or electronic recruiting, is the process of using Internet-based software to attract,
screen and recruit suitable job candidates. E-recruiting reduces the cost of the recruitment
process, reduces the time taken to identify appropriate candidates and helps organizations
improve the quality and quantity of the applicant pool.


Electronic business 


Electronic Business, or "e-Business", may be defined broadly as any business process that 
relies on an automated information system. Today, this is mostly done with Web-based 
technologies. The term "e-Business" was coined by Lou Gerstner, CEO of IBM. 

Electronic business methods enable companies to link their internal and external data 
processing systems more efficiently and flexibly, to work more closely with suppliers and 
partners, and to better satisfy the needs and expectations of their customers. 

In practice, e-business is more than just e-commerce. E-business refers to a more
strategic focus, with an emphasis on the functions that occur using electronic capabilities, while
e-commerce is a subset of an overall e-business strategy. E-commerce seeks to add revenue
streams using the World Wide Web or the Internet to build and enhance relationships with
clients and partners and to improve efficiency using the Empty Vessel strategy. Often,
e-commerce involves the application of knowledge management systems.

E-business involves business processes spanning the entire value chain: electronic purchasing 
and supply chain management, processing orders electronically, handling customer service, and 
cooperating with business partners. Special technical standards for e-business facilitate the 
exchange of data between companies. E-business software solutions allow the integration of
intra-firm and inter-firm business processes. E-business can be conducted using the Web, the Internet,
intranets, extranets, or some combination of these. 


Subsets 


Applications can be divided into three categories:
1. Internal business systems:
o customer relationship management
o enterprise resource planning
o document management systems
o human resources management
2. Enterprise communication and collaboration:
o VoIP
o content management system
o e-mail
o voice mail
o Web conferencing
3. Electronic commerce - business-to-business electronic commerce (B2B) or
business-to-consumer electronic commerce (B2C):
o internet shop
o supply chain management
o online marketing


Models 


When organizations go online, they have to decide which e-business models best suit their
goals. A business model is defined as the organization of product, service and information
flows, and the source of revenues and benefits for suppliers and customers. An e-business
model is the same concept applied to an online presence. The following is a list of the
most widely adopted e-business models:

• E-shops
• E-procurement
• E-malls
• E-auctions
• Virtual Communities
• Collaboration Platforms
• Third-party Marketplaces
• Value-chain Integrators
• Value-chain Service Providers
• Information Brokerage


Classification by provider and consumer 


Roughly dividing the world into providers/producers and consumers/clients, one can classify
e-businesses into the following categories:
• business-to-business (B2B)
• business-to-consumer (B2C)
• business-to-employee (B2E)
• business-to-government (B2G)
• government-to-business (G2B)
• government-to-government (G2G)
• government-to-citizen (G2C)
• consumer-to-consumer (C2C)
• consumer-to-business (C2B)
It is notable that there are comparatively fewer connections pointing "upwards" than
"downwards" (few employee/consumer/citizen-to-X models).


Electronic commerce


Electronic commerce, commonly known as e-commerce or eCommerce, consists of the 
buying and selling of products or services over electronic systems such as the Internet and other 
computer networks. The amount of trade conducted electronically has grown dramatically since 
the wide introduction of the Internet. A wide variety of commerce is conducted in this way, 
including things such as electronic funds transfer, supply chain management, e-marketing, online 
marketing, online transaction processing, electronic data interchange (EDI), automated inventory 
management systems, and automated data collection systems. Modern electronic commerce
typically uses the World Wide Web at some point in the transaction's lifecycle, although
it can encompass a wide range of technologies such as e-mail as well.

A small percentage of electronic commerce is conducted entirely electronically for "virtual" 
items such as access to premium content on a website, but most electronic commerce eventually 
involves physical items and their transportation in at least some way. 


History 


The meaning of the term "electronic commerce" has changed over the last 30 years.
Originally, "electronic commerce" meant the facilitation of commercial transactions
electronically, usually using technologies such as Electronic Data Interchange (EDI) and
Electronic Funds Transfer (EFT). Both were introduced in the late 1970s, for example to send
commercial documents such as purchase orders or invoices electronically.

The "electronic" or "e" in e-commerce refers to the technology and systems; the "commerce" refers
to traditional business models. E-commerce is the complete set of processes that support
commercial business activities on a network. In the 1970s and 1980s, this would also have
involved information analysis. The growth and acceptance of credit cards, automated teller
machines (ATM) and telephone banking in the 1980s were also forms of e-commerce. From
the 1990s onwards, it also came to include enterprise resource planning systems (ERP), data
mining and data warehousing.

In the dot com era, it came to include activities more precisely termed "Web commerce" -- 
the purchase of goods and services over the World Wide Web, usually with secure connections 
(HTTPS, a special server protocol that encrypts confidential ordering data for customer 
protection) with e-shopping carts and with electronic payment services, like credit card payment 
authorizations. 

Today, it encompasses a very wide range of business activities and processes, from e- 
banking to offshore manufacturing to e-logistics. The ever growing dependence of modern 
industries on electronically enabled business processes gave impetus to the growth and 
development of supporting systems, including backend systems, applications and middleware.
Examples are broadband and fibre-optic networks, supply-chain management software, customer 
relationship management software, inventory control systems and financial accounting software. 

When the Web first became well-known among the general public in 1994, many journalists 
and pundits forecast that e-commerce would soon become a major economic sector. However, it 
took about four years for security protocols (like HTTPS) to become sufficiently developed and 
widely deployed. Subsequently, between 1998 and 2000, a substantial number of businesses in 
the United States and Western Europe developed rudimentary web sites. 

Although a large number of "pure e-commerce" companies disappeared during the dot-com 
collapse in 2000 and 2001, many "brick-and-mortar" retailers recognized that such companies 
had identified valuable niche markets and began to add e-commerce capabilities to their Web 
sites. For example, after the collapse of online grocer Webvan, two traditional supermarket 
chains, Albertsons and Safeway, both started e-commerce subsidiaries through which consumers 
could order groceries online. 

The emergence of e-commerce also significantly lowered barriers to entry in the selling of 
many types of goods; accordingly many small home-based proprietors are able to use the internet 
to sell goods. Often, small sellers use online auction sites such as eBay, or sell via large 
corporate websites like Amazon.com, in order to take advantage of the exposure and setup 
convenience of such sites. 


Success factors 


In many cases, an e-commerce company will survive not only based on its product, but by
having a competent management team, good post-sales services, a well-organized business
structure, network infrastructure and a secure, well-designed website. A company that wants to
succeed has to attend to two things: technical and organizational aspects, and customer
orientation. The following factors help companies succeed in e-commerce:


Technical and organizational aspects 


1. Sufficient work done in market research and analysis. E-commerce is not 
exempt from good business planning and the fundamental laws of supply and 
demand. Business failure is as much a reality in e-commerce as in any other 
form of business. 

2. A good management team armed with information technology strategy. A 
company's IT strategy should be a part of the business re-design process. 

3. Providing an easy and secure way for customers to effect transactions. Credit
cards are the most popular means of sending payments on the internet,
accounting for 90% of online purchases. In the past, card numbers were
transferred securely between the customer and merchant through independent
payment gateways. Such independent payment gateways are still used by most
small and home businesses. Most merchants today process credit card
transactions on site through arrangements made with commercial banks or
credit card companies.

4. Providing reliability and security. Parallel servers, hardware redundancy,
fail-safe technology, information encryption, and firewalls can help meet this
requirement.

5. Providing a 360-degree view of the customer relationship, defined as ensuring 
that all employees, suppliers, and partners have a complete view, and the same 
view, of the customer. However, customers may not appreciate the big brother 
experience. 

6. Constructing a commercially sound business model. 

7. Engineering an electronic value chain in which one focuses on a "limited" 
number of core competencies -- the opposite of a one-stop shop. (Electronic 
stores can appear either specialist or generalist if properly programmed.) 

8. Operating on or near the cutting edge of technology and staying there as 
technology changes (but remembering that the fundamentals of commerce 
remain indifferent to technology). 

9. Setting up an organization of sufficient alertness and agility to respond quickly 
to any changes in the economic, social and physical environment. 

10. Providing an attractive website. The tasteful use of colour, graphics, animation, 
photographs, fonts, and white-space percentage may aid success in this respect. 

11. Streamlining business processes, possibly through re-engineering and
information technologies.

12. Providing complete understanding of the products or services offered, which 
not only includes complete product information, but also sound advisors and 
selectors. 

Naturally, the e-commerce vendor must also perform such mundane tasks as being truthful 
about its product and its availability, shipping reliably, and handling complaints promptly and 
effectively. A unique property of the Internet environment is that individual customers have 
access to far more information about the seller than they would find in a brick-and-mortar 
situation. (Of course, customers can, and occasionally do, research a brick-and-mortar store 
online before visiting it, so this distinction does not hold water in every case.) 


Customer experience 


A successful e-commerce organization must also provide an enjoyable and rewarding 
experience to its customers. Many factors go into making this possible. Such factors include: 

1. Providing value to customers. Vendors can achieve this by offering a product or 
product-line that attracts potential customers at a competitive price, as in non- 
electronic commerce. 

2. Providing service and performance. Offering a responsive, user-friendly 
purchasing experience, just like a flesh-and-blood retailer, may go some way to 
achieving these goals. 

3. Providing an incentive for customers to buy and to return. Sales promotions to
this end can involve coupons, special offers, and discounts. Cross-linked
websites and advertising affiliate programs can also help.

4. Providing personal attention. Personalized web sites, purchase suggestions, and
personalized special offers may go some of the way to substituting for the
face-to-face human interaction found at a traditional point of sale.

5. Providing a sense of community. Chat rooms, discussion boards, soliciting
customer input and loyalty programs (sometimes called affinity programs) can
help in this respect.

6. Owning the customer's total experience. E-tailers foster this by treating any
contacts with a customer as part of a total experience, an experience that
becomes synonymous with the brand.

7. Letting customers help themselves. Provision of a self-serve site, easy to use
without assistance, can help in this respect. This implies that all product
information is available, including cross-sell information, advice on product
alternatives, and supplies and accessory selectors.

8. Helping customers do their job of consuming. E-tailers and online shopping
directories can provide such help through ample comparative information and
good search facilities. Provision of component information and
safety-and-health comments may assist e-tailers to define the customers' job.


Even if a provider of e-commerce goods and services rigorously follows these "key factors"
to devise an exemplary e-commerce strategy, problems can still arise. Sources of such problems
include:

1. Failure to understand customers, why they buy and how they buy. Even a
product with a sound value proposition can fail if producers and retailers do not
understand customer habits, expectations, and motivations. E-commerce vendors
could mitigate this problem with proactive and focused
marketing research, just as traditional retailers may do.

2. Failure to consider the competitive situation. One may have the will to construct
a viable book e-tailing business model, but lack the capability to compete with
Amazon.com.

3. Inability to predict environmental reaction. What will competitors do? Will they
introduce competitive brands or competitive web sites? Will they supplement
their service offerings? Will they try to sabotage a competitor's site? Will price
wars break out? What will the government do? Research into competitors,
industries and markets may mitigate some consequences here, just as in
non-electronic commerce.

4. Over-estimation of resource competence. Can staff, hardware, software, and
processes handle the proposed strategy? Have e-tailers failed to develop
employee and management skills? These issues may call for thorough resource
planning and employee training.

5. Failure to coordinate. If existing reporting and control relationships do not
suffice, one can move towards a flat, accountable, and flexible organizational
structure, which may or may not aid coordination.

6. Failure to obtain senior management commitment. This often results in a failure
to gain sufficient corporate resources to accomplish a task. It may help to get
top management involved right from the start.

7. Failure to obtain employee commitment. If planners do not explain their 
strategy well to employees, or fail to give employees the whole picture, then 
training and setting up incentives for workers to embrace the strategy may 
assist. 

8. Under-estimation of time requirements. Setting up an e-commerce venture can 
take considerable time and money, and failure to understand the timing and 
sequencing of tasks can lead to significant cost overruns. Basic project planning, 
critical path, critical chain, or PERT analysis may mitigate such failings. 
Profitability may have to wait for the achievement of market share. 

9. Failure to follow a plan. Poor follow-through after the initial planning, and 
insufficient tracking of progress against a plan can result in problems. One may 
mitigate such problems with standard tools: benchmarking, milestones, 
variance tracking, and penalties and rewards for variances. 

10. Becoming the victim of organized crime. Many syndicates have caught on to the 
potential of the Internet as a new revenue stream. Two main methods are as 
follows: (1) Using identity theft techniques like phishing to order expensive 
goods and bill them to some innocent person, then liquidating the goods for 
quick cash; (2) Extortion by using a network of compromised "zombie" 
computers to engage in distributed denial of service attacks against the target 
Web site until it starts paying protection money. 

11. Failure to expect the unexpected. Too often new businesses do not take into 
account the amount of time, money or resources needed to complete a project 
and often find themselves without the necessary components to become 
successful. 
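
The PERT analysis mentioned in item 8 is often summarised as a three-point estimate of each task's duration. The following is only a sketch under stated assumptions: the task names and all durations are invented, and the tasks are assumed to run strictly one after another, so the expected durations can simply be added.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    optimistic: float   # best-case duration in days
    most_likely: float  # most probable duration in days
    pessimistic: float  # worst-case duration in days

    def expected(self) -> float:
        # Classic PERT weighted mean: (O + 4M + P) / 6
        return (self.optimistic + 4 * self.most_likely + self.pessimistic) / 6

    def std_dev(self) -> float:
        # Conventional PERT spread estimate: (P - O) / 6
        return (self.pessimistic - self.optimistic) / 6


# Hypothetical, strictly sequential launch tasks (all figures invented).
tasks = [
    Task("Build product catalogue", 5, 10, 21),
    Task("Integrate payment gateway", 3, 6, 15),
    Task("Test checkout flow", 2, 4, 12),
]

total_expected = sum(t.expected() for t in tasks)
for t in tasks:
    print(f"{t.name}: {t.expected():.1f} days (+/- {t.std_dev():.1f})")
print(f"Expected total: {total_expected:.1f} days")
```

Because the pessimistic figure is weighted into the mean, the expected total comes out longer than the sum of the most likely durations, which is one way such an estimate can guard against under-estimating time requirements.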


Product suitability 


Certain products or services appear more suitable for online sales; others remain more 
suitable for offline sales. While credit cards are currently the most popular means of paying for
online goods and services, alternative online payments will account for 26% of e-commerce
volume by 2009 according to Celent.

Many successful purely virtual companies deal with digital products (including information
storage, retrieval, and modification), music, movies, office supplies, education, communication,
software, photography, and financial transactions. Examples of this type of company include
Google, eBay and PayPal. Other successful marketers use drop shipping or affiliate
marketing techniques to facilitate transactions of tangible goods without maintaining real
inventory. Examples include numerous sellers on eBay.

Virtual marketers can sell some non-digital products and services successfully. Such products 
generally have a high value-to-weight ratio, they may involve embarrassing purchases, they may 
typically go to people in remote locations, and they may have shut-ins as their typical purchasers. 
Items which can fit through a standard letterbox, such as music CDs, DVDs and books, are
particularly suitable for a virtual marketer, and indeed Amazon.com, one of the few enduring
dot-com companies, has historically concentrated on this field.

Products such as spare parts, both for consumer items like washing machines and for 
industrial equipment like centrifugal pumps, also seem good candidates for selling online. 
Retailers often need to order spare parts specially, since they typically do not stock them at 
consumer outlets -- in such cases, e-commerce solutions in spares do not compete with retail 
stores, only with other ordering systems. A factor for success in this niche can consist of 
providing customers with exact, reliable information about which part number their particular 
version of a product needs, for example by providing parts lists keyed by serial number. 
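
One way to key a parts list by serial number is to record, for each part, the range of serial numbers it fits. The sketch below assumes numeric serial numbers; the model name, serial ranges and part numbers are all invented for illustration and do not describe any real catalogue.

```python
from typing import NamedTuple, Optional


class PartRange(NamedTuple):
    first_serial: int  # first serial number the part fits (inclusive)
    last_serial: int   # last serial number the part fits (inclusive)
    part_number: str


# Hypothetical catalogue: (model, component) -> serial ranges.
CATALOGUE = {
    ("WM-100", "drain pump"): [
        PartRange(1, 49_999, "DP-1001"),
        PartRange(50_000, 120_000, "DP-1002"),
    ],
}


def find_part(model: str, component: str, serial: int) -> Optional[str]:
    """Return the part number fitting this serial, or None if unknown."""
    for entry in CATALOGUE.get((model, component), []):
        if entry.first_serial <= serial <= entry.last_serial:
            return entry.part_number
    return None


print(find_part("WM-100", "drain pump", 61_234))
```

Returning `None` for an unknown serial, rather than guessing the nearest range, keeps the lookup from suggesting a part that may not fit.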

Purchases of pornography and of other sex-related products and services fulfill the 
requirements of both virtuality (or if non-virtual, generally high-value) and potential 
embarrassment; unsurprisingly, provision of such services has become the most profitable 
segment of e-commerce. 

E-commerce also has disadvantages, one of the main ones being fraud. A customer may enter
personal details (name, bank card number, age, national insurance number) into a site that
looks safe but is not. These details can then be used to steal money from the customer, or to
buy things online without the customer being aware of it until it is too late. If the information
falls into the wrong hands, criminals can also steal the customer's identity and commit further
fraud in the customer's name. The other common problems (failure to understand customers,
failure to consider the competitive situation, inability to predict environmental reaction, and
over-estimation of resource competence) are described in the list of problems under Success
factors.

Products less suitable for e-commerce include products that have a low value-to-weight ratio,
products that have a smell, taste, or touch component, products that need trial fittings (most
notably clothing), and products where colour integrity appears important. Nonetheless,
Tesco.com has had success delivering groceries in the UK, albeit that many of its goods are of a
generic quality, and clothing sold through the internet is big business in the U.S. Also, the
recycling program Cheapcycle sells goods over the internet, but avoids the low value-to-weight
ratio problem by creating different groups for various regions, so that shipping costs remain low.


Acceptance


Consumers have accepted the e-commerce business model less readily than its proponents
originally expected. Even in product categories suitable for e-commerce, electronic shopping has
developed only slowly. Several reasons might account for the slow uptake, including:


1. Concerns about security. Many people will not use credit cards over the Internet
due to concerns about theft and credit card fraud.

2. Lack of instant gratification with most e-purchases (non-digital purchases).
Much of a consumer's reward for purchasing a product lies in the instant
gratification of using and displaying that product. This reward does not exist
when one's purchase does not arrive for days or weeks.

3. The problem of access to web commerce, mainly for poor households and for
developing countries. Low penetration rates of Internet access in some sectors
greatly reduce the potential for e-commerce.

4. The social aspect of shopping. Some people enjoy talking to sales staff, to other
shoppers, or to their cohorts: this social reward side of retail therapy does not
exist to the same extent in online shopping.

5. Poorly designed, bug-infested e-commerce web sites that frustrate online
shoppers and drive them away.

6. Inconsistent return policies among e-tailers or difficulties in exchange/return.


Freight exchange


A freight exchange is an online, real-time service where hauliers search for freight (e.g. for a
backload) and freight forwarders offer freight to be transported.


What a freight exchange does 


These systems provide a platform that allows carriers to communicate freight traffic 
information to fellow operators such as transporters, forwarders and logistics companies. They 
allow forwarders to advertise their freight either privately or publicly to a large number of freight 
operators that are looking for loads. They also allow freight operators to offer vehicle space. 
Online systems are normally subscription-based with a small charge for advertising (posting) and
searching (consulting).


History


The world's first electronic freight exchange was called Teleroute and was launched in 
France on the Minitel system in 1985. Now, there are many examples around the world offering 
many services to hauliers and freight forwarders. 


An example 


Imagine you have an order to transport tulips from Keukenhof in the Netherlands to Como,
Italy, but you don't have a freight order for the return trip and you would like to find return
freight to increase your profitability. You can achieve this by searching for return freight, or a
"backload", on a freight exchange.
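
The search for a backload can be sketched as a filter over the loads that forwarders have posted. This is a minimal illustration, not a description of how any particular freight exchange works: the `Load` fields, the sample postings and the matching rule (same origin and destination, weight within the vehicle's capacity) are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Load:
    origin: str
    destination: str
    weight_kg: int
    description: str


def find_backloads(postings: List[Load], origin: str, destination: str,
                   capacity_kg: int) -> List[Load]:
    """Return posted loads running origin -> destination that fit the vehicle."""
    return [
        load for load in postings
        if load.origin == origin
        and load.destination == destination
        and load.weight_kg <= capacity_kg
    ]


# Invented postings from freight forwarders.
postings = [
    Load("Como", "Amsterdam", 8_000, "silk fabric"),
    Load("Milan", "Lyon", 12_000, "machine parts"),
    Load("Como", "Amsterdam", 30_000, "marble slabs"),
]

# The haulier has delivered tulips to Como and wants freight back to the Netherlands.
matches = find_backloads(postings, "Como", "Amsterdam", capacity_kg=20_000)
for load in matches:
    print(load.description, load.weight_kg)
```

A real exchange would presumably match on regions or distance rather than exact place names, but the principle of filtering posted freight against the haulier's route and free capacity is the same.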


See also 


• Electronic commerce


Mashup 


A mashup is a web application that combines data from more than one source into an 
integrated experience. The etymology of this term possibly derives from its similar use in pop 
music. 


Overview 


Content used in mashups is typically sourced from a third party via a public interface or API,
although some in the community believe that only cases where private interfaces are not used
count as mashups. Other methods of sourcing content for mashups include Web feeds (e.g. RSS
or Atom), web services and screen scraping. Many people are experimenting with mashups
using Microsoft, Google, eBay, Amazon, Flickr, and Yahoo APIs, which has led to the creation
of mashup editors.
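
At its simplest, a mashup joins records from two independent sources on a shared key. The sketch below uses two in-memory structures in place of real API responses; the field names and sample data are invented, and a real mashup would fetch this data from public APIs or feeds instead.

```python
from typing import Dict, List

# Stand-in for a response from a photo-sharing API (invented data).
photos = [
    {"title": "Harbour at dawn", "place": "Lisbon"},
    {"title": "Old town roofs", "place": "Prague"},
    {"title": "Unlabelled snapshot", "place": "Atlantis"},
]

# Stand-in for a response from a mapping or geocoding API (invented data).
coordinates = {
    "Lisbon": (38.72, -9.14),
    "Prague": (50.08, 14.44),
}


def mash_up(photos: List[Dict], coordinates: Dict) -> List[Dict]:
    """Attach map coordinates to each photo whose place is known."""
    combined = []
    for photo in photos:
        location = coordinates.get(photo["place"])
        if location is None:
            continue  # skip photos the second source cannot place
        lat, lon = location
        combined.append({**photo, "lat": lat, "lon": lon})
    return combined


markers = mash_up(photos, coordinates)
for marker in markers:
    print(marker)
```

The combined records could then be plotted as markers on a map, which is the kind of integrated experience neither source offers on its own.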


Quotes 


“a lot of talk about Web 2.0, web mashups, Ajax etc., which in my mind are all facets of the 
same phenomenon: that information and presentation are being separated in ways that 
allow for novel forms of reuse.” 


Sho Kuwamoto 


“The mash-up part of this equation, is the offspring of an environment where application 
developers see it in their own selfish interest to facilitate the creation of integrated, yet 
highly derivative application hybrids by third parties, something they do by providing rich 
public APIs to their user base.” 

Mark Sigal


Online banking


Online banking (or Internet banking) is a term used for performing transactions, payments
etc. over the Internet through the secure website of a bank, credit union or building society. This
allows customers to do their banking outside of bank hours and from anywhere Internet
access is available. In most cases a web browser is used and any normal Internet connection is
suitable. No special software or hardware is usually needed.


Features 


Online banking usually offers such features as:
• Bank statements, with the possibility to import data into a personal finance
program such as Quicken or Microsoft Money
• Electronic bill payment
• Funds transfer between a customer's own checking and savings accounts, or to
another customer's account
• Investment purchase or sale
• Loan applications and transactions, such as repayments
• Account aggregation to allow the customers to monitor all of their accounts in
one place whether they are with their main bank or with other institutions.

There is a growing number of so-called virtual banks that operate exclusively online. These
online banks have low costs compared to traditional banks and so they often offer higher interest
rates.


Security 


Protection through single password authentication, as is the case in most secure Internet 
shopping sites, is not considered secure enough for personal online banking applications in some 
countries. Online banking user interfaces are secure sites (generally employing the https 
protocol) and traffic of all information - including the password - is encrypted, making it next to 
impossible for a third party to obtain or modify information after it is sent. However, encryption 
alone does not rule out the possibility of hackers gaining access to vulnerable home PCs and 
intercepting the password as it is typed in (keystroke logging). There is also the danger of 
password cracking and physical theft of passwords written down by careless users. 

Many online banking services therefore impose a second layer of security. Strategies vary,
but a common method is the use of transaction numbers, or TANs, which are essentially
single-use passwords. Another strategy is the use of two passwords, only random parts of which are
entered at the start of every online banking session. This is, however, slightly less secure than the
TAN alternative and more inconvenient for the user. A third option, used in many European
countries and currently being trialled in the UK, is providing customers with security token
devices capable of generating single-use passwords unique to the customer's token (this is called
two-factor authentication or 2FA). Another option is using digital certificates, which digitally
sign or authenticate the transactions by linking them to the physical device (e.g. computer,
mobile phone, etc.). While most online banking in the United States still uses single-password
protection, the FFIEC issued regulations requiring that banks implement more secure
authentication mechanisms by the end of 2006. Most large U.S. banks have responded not with 
security tokens or digital certificates, but by setting up a combination of controls that recognize a 
customer's computer, ask additional challenge questions for risky behavior, and monitor for 
fraudulent behavior. 
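
The single-use passwords generated by a security token can be illustrated with the counter-based HOTP algorithm from RFC 4226. The text does not say which algorithm any bank uses, so HOTP is an assumption made here for illustration; the `TokenVerifier` class is hypothetical, and the secret is the RFC's published test key, which must never be used in practice.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an RFC 4226 HOTP code for the given secret and counter."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation offset
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


class TokenVerifier:
    """Hypothetical bank-side check: each code is accepted at most once."""

    def __init__(self, secret: bytes) -> None:
        self.secret = secret
        self.counter = 0  # next counter value the bank expects

    def verify(self, code: str) -> bool:
        expected = hotp(self.secret, self.counter)
        if hmac.compare_digest(code, expected):
            self.counter += 1  # advance so the same code cannot be replayed
            return True
        return False


secret = b"12345678901234567890"  # RFC 4226 test key, for illustration only
bank = TokenVerifier(secret)
code = hotp(secret, 0)            # what the customer's token would display
print(bank.verify(code))          # first use is accepted
print(bank.verify(code))          # replaying the same code is rejected
```

Because the secret is unique to each token and the counter advances after every successful login, an intercepted code is of no use for a later session. A production verifier would typically also tolerate a small window of skipped counter values, which this sketch omits.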

Banks in many European countries (including the Scandinavian countries, the Netherlands, 
Austria and Belgium) are offering online banking for e-commerce payments directly from 
customer to merchants. For instance, see iDEAL. 


Fraud 


Some customers avoid online banking because they perceive it as too vulnerable to fraud. 
The security measures employed by most banks can never be completely safe, but in practice the 
number of fraud victims due to online banking is very small. This is probably because 
a relatively small number of people use Internet banking compared with the total number of 
banking customers worldwide. Indeed, conventional banking practices may be more prone to 
abuse by fraudsters than online banking. Credit card fraud, signature forgery and identity theft 
are far more widespread "offline" crimes than malicious hacking. Bank transactions are generally 
traceable and criminal penalties for bank fraud are high. Online banking becomes less secure if 
users are careless, gullible or computer illiterate. An increasingly popular criminal practice for 
gaining access to a user's finances is phishing, whereby the user is in some way persuaded to hand 
over their password(s) to a fraudster. 


Online gambling 


Online gambling is a general term for gambling using the Internet. This article provides a 
brief introduction to some of the forms of online gambling, as well as discussing general issues. 


Online poker 


Online poker tables commonly offer Texas hold 'em, Omaha, Seven-card stud, razz and other 
game types in both tournament and ring game structures. Players play against each other rather 
than the "house", with the card room making its money through "rake" and through tournament 
fees. 


Online casinos 


There are a large number of online casinos, in which people can play casino games such as 
roulette, blackjack, pachinko, baccarat and many others. These games are played against the 
"house", which makes money due to the fact that the odds are in its favor. 


Online sports betting 


Bookmakers and betting exchanges offer fixed-odds gambling over the Internet on the results 
of sporting events. 


Online bingo 


There are a number of online bingo rooms offering games on the Internet. 


Mobile gambling 


Developments in the use of wireless, mobile devices to gamble follow in the wake of 
mainstream online gambling. 


Funds transfers 


Typically, gamblers upload funds to the online gambling company, make bets or play the 
games that it offers, and then cash out any winnings. European gamblers can often fund 
gambling accounts by credit card or debit card, and cash out winnings directly back to the card. 
However, most US banks prohibit the use of their cards for the purpose of internet gambling, and 
attempts by Americans to use credit cards at internet gambling sites are usually rejected.[1] A 
number of electronic money services offer accounts with which online gambling can be funded. 
However, many top fund-transfer sites such as FirePay, Neteller & Moneybookers have 
discontinued service for U.S. residents. 

Payment by check and wire transfer is also common. 


Legality 
United States 


The United States Court of Appeals for the Fifth Circuit ruled[2] in November 2002 that the 
Federal Wire Act prohibits electronic transmission of information for sports betting across state 
lines but affirmed a lower court ruling[3] that the Wire Act "'in plain language' does not prohibit 
Internet gambling on a game of chance." 

Some states have specific laws against online gambling of any kind. Also, owning an online 
gaming operation without proper licensing would be illegal, and no states are currently granting 
online gaming licenses. 

In March 2003, Deputy Assistant Attorney General John G. Malcolm testified before the 
Senate Banking Committee regarding the special problems presented by online gambling.[4] A 
major concern of the United States Department of Justice is online money laundering. The 
anonymous nature of the Internet and the use of encryption make it especially difficult to trace 
online money laundering transactions. 

In April 2004 Google and Yahoo!, the internet's two largest search engines, announced that 
they were removing online gambling advertising from their sites. The move followed a United 
States Department of Justice announcement that, in what some say is a contradiction of the 
Appeals Court ruling, the Wire Act relating to telephone betting applies to all forms of Internet 
gambling, and that any advertising of such gambling "may" be deemed as aiding and abetting. 
Critics of the Justice Department's move say that it has no legal basis for pressuring companies 
to remove advertisements and that the advertisements are protected by the First Amendment. As 
of April 2005, Yahoo! has provided advertising for "play money" online gaming. 

In August 2004, Casino City, an online portal for internet gambling sites, sued the US 
Department of Justice. The complaint alleged, inter alia, that the website's business—promoting 
internet gambling—was legal, and requested a declaration from the court that its business was 
protected by the First Amendment. The U.S. District Court for the Middle District of Louisiana 
dismissed the case in February of 2005. 

In its opinion, the District Court wrote: 

"It is well-established that the First Amendment does not protect the right to advertise illegal 
activity... The government's interest is specifically directed towards the advertising of illegal 
activity, namely Internet gambling... Furthermore, the speech in which the plaintiff wishes to 
engage is misleading because it falsely portrays the image that Internet gambling is legal... 
Because plaintiff's speech concerns misleading information and illegal activities, it does not 
fall within the speech that is protected by the First Amendment."[5] 

The US Court of Appeals, 5th Circuit, dismissed Casino City's appeal in January 2006.[6] 

In February 2005 the North Dakota House of Representatives passed a bill to legalize and 
regulate online poker and online poker cardroom operators in the State. Testifying before the 
State Senate, Nigel Payne, CEO of Paradise Poker, pledged to relocate to the state if the bill 
became law. However, the measure was defeated by the State Senate in March 2005. Rep. Jim 
Kasper, who sponsored the 2005 legislation, plans to introduce similar bills in the 2007 North 
Dakota legislative session. 

In July 2006, David Carruthers, the CEO of BetOnSports, a company publicly traded on the 
London Stock Exchange, was detained in Texas while changing planes on his way from London 
to Costa Rica.[7] He and ten other individuals had been previously charged in a sealed 
indictment with violations of US Federal laws relating to illegal gambling. While, as noted above, 
a United States Appeals court has stated that the Wire Act does not apply to non-sports betting, 
the Supreme Court of the United States previously refused to hear an appeal of the conviction of 
Jay Cohen, where lower courts held that the Wire Act does make it illegal to own a sports betting 
operation that offers such betting to United States citizens.[8] 

The BetOnSports indictment[9] alleged violations of at least 9 different Federal statutes, 
including 18 USC Sec. 1953 (Operation of an Illegal Gambling Business). Carruthers is currently 
under house arrest on a one million dollar bail bond.[10] 

In September 2006, SportingBet PLC reported that its chairman, Peter Dicks, was detained in 
New York City on a Louisiana warrant while traveling in the United States on business unrelated 
to online gaming.[11] Louisiana is one of the few states that has a specific law prohibiting 
gambling online. At the end of the month, New York dismissed the Louisiana warrant.[12] 

Also in September 2006, just before adjourning for the midterm elections, both the House of 
Representatives and Senate passed legislation (as an amendment to the unrelated SAFE Port Act) 
that would make transactions from banks or similar institutions to online gambling sites illegal. 
This differs from a previous bill passed only by the House that expanded the scope of the Wire 
Act. The passed bill only addresses banking issues.[13] The act was signed into law on October 
13, 2006 by President George W. Bush, and there is a provision for a 270-day period to develop 
enforcement measures. At the bill-signing ceremony, Bush never mentioned the Internet 
gambling measure, which was supported by the National Football League and opposed by 
banking groups.[14] 

In response to this new legislation, a number of online gambling operators including 
PartyGaming, The bwin Group, Cassava Enterprises, and Sportingbet announced that real-money 
gambling operations would be suspended for U.S. customers. PartyGaming's stock dropped by 
60% following its announcement. Other operators such as PokerStars, Bodog, and WSEX.com 
announced their intention to continue serving customers in the U.S. 

On April 26, 2007, Rep. Barney Frank (D-MA) introduced HR 2046, the Internet Gambling 
Regulation and Enforcement Act (IGREA). The IGREA would modify the Unlawful Internet 
Gambling Enforcement Act (UIGEA), the 2006 legislation described above, by providing for 
the licensing of Internet gambling facilities by the Director of the Financial Crimes 
Enforcement Network. On June 8, 2007, the House Financial Services Committee, chaired by 
Rep. Barney Frank, held a hearing entitled "Can Internet Gambling Be Effectively Regulated to 
Protect Consumers and the Payments System?".[15] Expert witnesses at the hearing testified that 
Internet gambling can be effectively regulated for age verification, money laundering issues, 
facilitation of state and federal tax collection, and for issues relating to compulsive gambling. 

On June 7, 2007, Rep. Robert Wexler (D-FL) introduced HR 2610, the Skill Game Protection 
Act. This act would legalize Internet poker, bridge, chess, and other games of skill. Also on June 
7, Rep. Jim McDermott (D-WA) introduced HR 2607, the Internet Gambling Tax Act. The 
IGTA would legislate Internet gambling tax collection requirements. 


Australia 


On 28 June 2001 the Australian Government passed the Interactive Gambling Act 
2001 (IGA). The government said that the IGA was important to protect Australians from the 
harmful effects of gambling. 

The IGA targets the providers of interactive gambling services, not their potential or actual 
customers. The IGA makes it an offence to provide an interactive gambling service to a customer 
physically present in Australia, but it is not an offence for Australian residents to play poker or 
casino games online. In stark contrast to the USA, sports betting online is also completely legal 
in Australia, with many state government licensed sportsbooks in operation, such as Centrebet, 
Sportingbet & Betfair. 

The offence applies to all interactive gambling service providers, whether based in Australia 
or offshore, whether Australian or foreign owned. The offence carries a maximum penalty of 
$220,000 per day for individuals and $1.1 million per day for bodies corporate. 

More information regarding the Interactive Gambling Act 2001 is available from the Australian 
Department of Communications, Information Technology and the Arts. 

Complaints regarding online gambling facilities serving Australian users can be made to the 
Australian Communications and Media Authority (ACMA). 


Other countries 


Various forms of online gambling are legal and regulated in many countries, including most 
members of the European Union and several nations in and around the Caribbean Sea. 

In India, online gambling is neither expressly legal nor illegal, since the law is silent on the 
issue; in the state of Maharashtra, however, it is an offence under the "Bombay Wager Act". 

The government of the island nation of Antigua and Barbuda, which licenses Internet 
gambling entities, made a complaint to the World Trade Organization about the U.S. 
government's actions to impede online gaming. The Caribbean country won the preliminary 
ruling but WTO's appeals body somewhat narrowed that favorable ruling in April 2005. The 
appeals decision held that various state laws argued by Antigua and Barbuda to be contrary to 
WTO agreements were not sufficiently discussed during the course of the proceedings to be 
properly assessed by the panel. However, the appeals panel also ruled that the Wire Act and two 
other federal statutes prohibiting the provision of gambling services from Antigua to the United 
States violated the WTO's General Agreement on Trade in Services, or "GATS". Although the 
United States convinced the appeals panel that these laws were "necessary" to protect public 
health and morals, the asserted United States defense on these grounds was ultimately rejected 
because its laws relating to remote gambling on horse-racing were not applied equally to foreign 
and domestic online betting companies, and thus the United States could not establish that its 
laws were non-discriminatory.[16] 

On March 30, 2007, the WTO confirmed that the U.S. "had done nothing to abide by an earlier 
verdict that labeled some U.S. Internet gambling restrictions as illegal."[17] 

On June 19, 2007, Antigua filed a claim with the WTO for US$3.4 billion in trade 
sanctions against the United States, along with a request for authorization to ignore U.S. patent 
and copyright laws. This came one day after similar demands for compensation were made by the 
European Union.[18] 


Problem gambling 


In the United States in 1999 the National Gambling Impact Study stated "the high-speed 
instant gratification of Internet games and the high level of privacy they offer may exacerbate 
problem and pathological gambling". Recently in the UK another government-funded report 
came to a similar conclusion, claiming that 75% of people who gamble online are "problem" or 
"pathological" gamblers, compared to just 20% of people who visit legitimate land-based 
casinos.[19] 


Money laundering 


It has also been alleged that the largely unsupervised electronic funds transfers inherent in 
online gambling are being exploited by criminal interests to launder large amounts of illegal 
cash.[20] 


Online Office 


Online Office suites are offered by websites in the form of software as a service. They offer 
the ability to use the common office tools online. They can be accessed online from any 
Internet-enabled device running any operating system. This allows people to work together 
worldwide and at any time, thereby leading to international web-based collaboration and virtual 
teamwork. Usually, the basic versions are offered for free, and for more advanced versions one is 
required to pay a nominal subscription fee. 


Advantages 


• No software is required to be downloaded and installed. 

• No upgrading hassles when new features are added or bugs eliminated. 

• They can run on thin clients with minimal hardware requirements. 

• They also provide the ability to share a document with several users without 
your own server. 

• No need to purchase or upgrade a software license. Instead these are available 
as software as a service. 


Disadvantages 


• Data Security - All your data resides on a remote server. 

• Speed - Most of the currently available online office suites require a high-speed 
broadband Internet connection. 

• Lack of all features available on the offline office suites. 


Components 


Typically, online office suites include a word processor, a spreadsheet element, a presentation 
program, an accountancy module, and a communications tool. 


See also 


• List of online spreadsheets 


Shop bot 


Shop Bots, also commonly called buyer agents, are sophisticated autonomous software 
agents that comb the internet providing users with low product prices and/or product 
recommendations. Most commonly used for commodity type products where competitor 
differentiation is limited to price, shopping bots are also used by many internet vendors to 
promote impulse purchases by recommending similar goods. 

Although heavily relied on by many internet consumers, shopping bots are not always 
concerned with the best interests of their users. The majority of shopping bots operate for profit 
and work in conjunction with numerous internet merchants advertising their products and 
specials when certain key terms are searched. By charging either advertising/listing fees or 
click-through fees, shopping bots earn their revenue by directing consumers to internet vendors. 
However, some internet retailers employ their own shopping bots that use collaborative filtering 
technologies to recommend goods and services to users through identifying purchasing trends. 
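
The source does not describe a specific collaborative filtering method. As a minimal sketch under that caveat, the example below counts which items were bought together and recommends the most frequent companions of what a user already owns. The basket data and the function names `build_co_purchases` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def build_co_purchases(baskets):
    """Count how often each ordered pair of distinct items shares a basket."""
    co = {}
    for basket in baskets:
        for a, b in permutations(set(basket), 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def recommend(co, owned, top_n=3):
    """Rank items the user does not own by co-purchase count with owned items."""
    scores = Counter()
    for item in owned:
        for other, count in co.get(item, {}).items():
            if other not in owned:
                scores[other] += count
    # Sort by score (descending), then by name, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked][:top_n]


baskets = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "case"],
    ["laptop", "mouse"],
]
co = build_co_purchases(baskets)
suggestions = recommend(co, {"camera"})
```

A production recommender would normally weight counts by item popularity and work from far more data; the sketch only shows how purchasing trends can be turned into suggestions.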


Web application development 


Web application development is the process and practice of developing web applications. 


Risk 


Just as with a traditional desktop application, web applications have varying levels of risk. A 
personal home page is much less risky than, for example, a stock trading web site. For some 
projects security, software bugs, etc. are major issues. If time to market, or technical complexity 
is a concern, documentation, test planning, change control, requirements analysis, architectural 
description and formal design and construction practices can mitigate risk. 


Technologies 


• Ajax 
• ASP 
• ASP.NET 
• CSS 
• ColdFusion 
• Java EE 
• JavaScript 
• Perl 
• PHP 
• Ruby, including Ruby on Rails 
• CGI 
• ISAPI/NSAPI 


Lifecycle Model 


Time to market, company-growth and requirements churn, three things that are emphasized 
in web-based business, coincide with the principles of the Agile practices. Some agile lifecycle 
models are: 


• Extreme Programming 
• Scrum 
• Timebox development 
• Feature Driven Development 


Testing 


Like traditional desktop applications, web applications undergo the same unit, integration 
and system testing. But because web application clients vary so greatly, teams might perform 
some additional testing, such as: 

• Security 
• Performance, Load, and Stress 
• HTML/CSS validation 
• Accessibility 
• Usability 


Many types of tests are automatable. At the component level, one of the xUnit packages can 
be a helpful tool. Or an organization can create its own unit testing framework. At the GUI level, 
Watir is useful. 
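
As an illustration of component-level testing with an xUnit-style package, the sketch below uses Python's built-in `unittest` module, which belongs to the xUnit family. The `slugify` function is a hypothetical component invented for this example and does not come from the source.

```python
import unittest


def slugify(title: str) -> str:
    """Hypothetical component under test: turn a page title into a URL slug."""
    words = "".join(c.lower() if c.isalnum() else " " for c in title).split()
    return "-".join(words)


class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Web Application Development"),
                         "web-application-development")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_empty_title(self):
        self.assertEqual(slugify(""), "")


def run_suite() -> unittest.TestResult:
    """Run the tests programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` from a build script lets the same checks run automatically on every change, which is the kind of automation the paragraph above refers to.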


Tools 


In the case of ASP.NET, a developer can use Microsoft Visual Studio to write code. But, as 
with most other programming languages, they can also use a text editor. Notepad++ is an 
example. 

For PHP, the Zend Development Environment provides numerous debugging tools and 
provides a rich feature set to make a PHP developer's life easier. 

Several code generation tools such as dbQwikSite are available to automate the development 
of code. Using such tools, non-technical users can produce working code, and experienced 
coders can accelerate the development cycle. 

Other tools include various browsers, FTP clients, etc. 


Frameworks and Reuse 


Main article: Web application framework 

Practicing code reuse and using web application frameworks can greatly improve both 
productivity and time to market (McConnell 1996:537). Reusing externally developed 
components can allow an organization to reap the above benefits, while potentially saving 
money. However, for smaller components, it might be just as easy to develop your own 
components as it would be to learn new APIs. Also, if a component is essential to the business, 
an organization might want to control its development. 


See also 


• Web Interoperability 
• Web application framework 


Web-based email 


Web-based email or webmail is a term referring to an e-mail service intended to be 
primarily accessed via a web browser, as opposed to through an application such as Mozilla's 
Thunderbird, Apple's Mail or Microsoft Outlook. 

One of the earliest Webmail services was Hotmail. Now, the most popular Webmail 
providers are arguably Gmail and Yahoo! Mail.[1] Some other webmail services include AIM 
Mail, Mail.com, Lycos Mail, and Windows Live Hotmail; see also Wikipedia's Comparison of 
webmail providers. 

A major advantage of web-based email over application-based email is that a user has the 
ability to access their inbox from any Internet-connected computer. However, the need for 
Internet access is also a setback, in that one cannot access their old messages, or work on new 
ones, when they are not connected to the Internet. 

In 1996, prior to its acquisition by Microsoft in 1997, Hotmail introduced its service, which 
became one of the first popular web-based email offerings. Following Hotmail's initial success, 
Yahoo! released its own webmail service, which also grew to become widely popular. 

Google's introduction of Gmail in 2004 sparked a period of rapid development in webmail, 
due to Gmail's new features such as JavaScript menus, text-based ads, and massive storage.[2] 
Other webmail providers responded by offering similar features, such as Yahoo!'s introduction of 
its new version of Yahoo! Mail. 


Software packages 


There are also software packages that allow an organization such as a company to offer email 
through the web for its associates. Some solutions are open source software, like SquirrelMail, 
and others are closed source, like the Outlook Web Access module for Microsoft Exchange, 
Socketmail and Atmail. 

Conversely, there are programs that can simulate a web browser to access web mail as if it 
were stored in a POP3 or IMAP account. They are susceptible, though, to changes in the user 
interface of the web service since there is no standard interface. 


Rendering and Compatibility 


There are important differences in rendering capabilities for many popular web mail services 
such as Gmail, Windows Live mail, Hotmail and Yahoo Mail. Due to the varying treatment of 
HTML tags, such as <style> and <head>, as well as CSS rendering inconsistencies, email 
marketing companies rely on older web development techniques to send cross-platform mail. 
This usually means a heavy reliance on HTML tables and inline CSS. 
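
The table-and-inline-CSS approach can be sketched with Python's standard `email` package. This is an assumption-laden illustration rather than any provider's recommended template: the function name `build_newsletter`, the 600-pixel width and the colours are all hypothetical.

```python
from email.message import EmailMessage
from html import escape


def build_newsletter(subject: str, headline: str, body_text: str) -> EmailMessage:
    """Build a multipart email whose HTML part uses a table and inline CSS only."""
    # No style or head elements: every rule is attached inline to a table cell.
    html = (
        '<table width="600" cellpadding="0" cellspacing="0" border="0">'
        '<tr><td style="font-family: Arial, sans-serif; font-size: 20px; color: #333333;">'
        f"{escape(headline)}</td></tr>"
        '<tr><td style="font-family: Arial, sans-serif; font-size: 14px; color: #555555;">'
        f"{escape(body_text)}</td></tr>"
        "</table>"
    )
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(f"{headline}\n\n{body_text}")  # plain-text fallback part
    msg.add_alternative(html, subtype="html")      # table-based HTML part
    return msg
```

Sending both a plain-text and an HTML part lets a client that mishandles the markup fall back to readable text.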


Web service 


The W3C defines a Web service (many sources also capitalize the second word, as in Web 
Services) as a software system designed to support interoperable machine-to-machine interaction 
over a network. Web services are frequently just Web APIs that can be accessed over a network, 
such as the Internet, and executed on a remote system hosting the requested services. 

The W3C Web service definition encompasses many different systems, but in common usage 
the term refers to clients and servers that communicate using XML messages that follow the 
SOAP standard. Common in both the field and the terminology is the assumption that there is 
also a machine readable description of the operations supported by the server, a description in the 
Web Services Description Language (WSDL). The latter is not a requirement of a SOAP 
endpoint, but it is a prerequisite for automated client-side code generation in the mainstream Java 
and .NET SOAP frameworks. Some industry organizations, such as the WS-I, mandate both 
SOAP and WSDL in their definition of a Web service. 


Specifications 
Core specifications 


The specifications that define Web services are intentionally modular, and as a result there is 
no one document that contains them all. Additionally, there is neither a single, nor a stable set of 
specifications. There are a few "core" specifications that are supplemented by others as the 
circumstances and choice of technology dictate, including: 

SOAP 
    An XML-based, extensible message envelope format, with "bindings" to underlying 
    protocols. The primary protocols are HTTP and HTTPS, although bindings for others, 
    including SMTP and XMPP, have been written. 

WSDL 
    An XML format that allows service interfaces to be described, along with the details of 
    their bindings to specific protocols. Typically used to generate server and client code, 
    and for configuration. 

UDDI 
    A protocol for publishing and discovering metadata about Web services, to enable 
    applications to find Web services, either at design time or runtime. 

Most of these core specifications have come from W3C, including XML, SOAP, and WSDL; 
UDDI comes from OASIS. 
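
To make the SOAP "message envelope" concrete, the sketch below builds and parses a SOAP 1.1 envelope with Python's standard XML library. The envelope namespace is the SOAP 1.1 one; the `GetQuote` operation, the `http://example.com/stock` namespace and the function names are hypothetical and stand in for whatever a real WSDL would define.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/stock"                  # hypothetical service namespace
NAMESPACES = {"soap": SOAP_ENV, "svc": SERVICE_NS}


def build_request(symbol: str) -> bytes:
    """Serialize a SOAP 1.1 envelope carrying a hypothetical GetQuote request."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    request = ET.SubElement(body, f"{{{SERVICE_NS}}}GetQuote")
    ET.SubElement(request, f"{{{SERVICE_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="utf-8")


def read_symbol(payload: bytes) -> str:
    """Parse an envelope and pull the symbol back out of the body."""
    root = ET.fromstring(payload)
    return root.find("soap:Body/svc:GetQuote/svc:Symbol", NAMESPACES).text
```

Over the HTTP binding, the bytes returned by `build_request` would travel as the body of a POST request. In practice, toolkits usually generate this plumbing from the WSDL description.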

See List of Web service specifications for a more complete listing. 


Profiles 


NICOLAE SFETCU: WEB DESIGN & DEVELOPMENT 


To improve interoperability of Web services, the WS-I publishes profiles. A profile is a set 
of core specifications (SOAP, WSDL, ...) in a specific version (SOAP 1.1, UDDI 2, ...) with 
some additional requirements to restrict the use of the core specifications. The WS-I also 
publishes use cases and test tools to help deploy profile-compliant Web services. 


Additional specifications, WS-* 


Some specifications have been developed or are currently being developed to extend Web 
Services capabilities. These specifications are generally referred to as WS-*. Here is a 
non-exhaustive list of these WS-* specifications. 

WS-Security 
    Defines how to use XML Encryption and XML Signature in SOAP to secure message 
    exchanges, as an alternative or extension to using HTTPS to secure the channel. 

WS-Reliability 
    An OASIS standard protocol for reliable messaging between two Web services. 

WS-ReliableMessaging 
    A protocol for reliable messaging between two Web services, issued by Microsoft, BEA 
    and IBM; it is currently being standardized by the OASIS organization [1]. 

WS-Addressing 
    A way of describing the address of the recipient (and sender) of a message, inside the 
    SOAP message itself. 

WS-Transaction 
    A way of handling transactions. 

Some of these additional specifications have come from the W3C. There is much discussion 
around the organization's participation, as the general Web and the Semantic Web story appear to 
be at odds with much of the Web Services vision. This has surfaced most recently in February 
2007, at the Web of Services for the Enterprise workshop. Some of the participants advocated a 
withdrawal of the W3C from further WS-* related work, and a focus on the core Web. 

In contrast, OASIS has standardized many Web service extensions, including Web Services 
Resource Framework and WSDM. 


Styles of use 


Web services are a set of tools that can be used in a number of ways. The three most common 
styles of use are RPC, SOA and REST. 


Remote procedure calls 


RPC Web services present a distributed function (or method) call interface that is familiar to 
many developers. Typically, the basic unit of RPC Web services is the WSDL operation. 

The first Web services tools were focused on RPC, and as a result this style is widely 
deployed and supported. However, it is sometimes criticised for not being loosely coupled, 
because it was often implemented by mapping services directly to language-specific function or 
method calls. Many vendors felt this approach to be a dead end, and pushed for RPC to be 
disallowed in the WS-I Basic Profile. 


Service-oriented architecture 


Web services can also be used to implement an architecture according to Service-oriented 
architecture (SOA) concepts, where the basic unit of communication is a message, rather than 
an operation. This is often referred to as "message-oriented" services. 

SOA Web services are supported by most major software vendors and industry analysts. 
Unlike RPC Web services, loose coupling is more likely, because the focus is on the "contract" 
that WSDL provides, rather than the underlying implementation details. 


Representational state transfer 


Finally, RESTful Web services attempt to emulate HTTP and similar protocols by 
constraining the interface to a set of well-known, standard operations (e.g., GET, PUT, 
DELETE). Here, the focus is on interacting with stateful resources, rather than messages or 
operations. 

RESTful Web services can use WSDL to describe SOAP messaging over HTTP, which 
defines the operations, or can be implemented as an abstraction purely on top of SOAP (e.g., 
WS-Transfer). 
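
The idea of constraining the interface to a few standard operations on resources can be sketched without any network code. The in-memory class below is a hypothetical stand-in for a server. It accepts only GET, PUT and DELETE on resource paths and answers with HTTP-style status codes, and the class and method names are assumptions made for illustration.

```python
class ResourceStore:
    """In-memory stand-in for a server exposing resources by URI path."""

    def __init__(self):
        self._resources = {}

    def handle(self, method: str, path: str, body=None):
        """Dispatch one of a small, fixed set of verbs; return (status, body)."""
        if method == "GET":
            if path in self._resources:
                return 200, self._resources[path]
            return 404, None
        if method == "PUT":
            created = path not in self._resources
            self._resources[path] = body  # create or replace the resource state
            return (201 if created else 200), body
        if method == "DELETE":
            if path not in self._resources:
                return 404, None
            del self._resources[path]
            return 204, None
        return 405, None  # verbs outside the uniform interface are rejected
```

Each resource is addressed by its path and the verbs mean the same thing for every resource, so a client needs no service-specific operation names. This is the contrast with the RPC style described earlier.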


Criticisms 


Critics of non-RESTful Web services often complain that they are too complex and biased 
towards large software vendors or integrators, rather than open source implementations. 

One big concern of the REST Web Service developers is that the SOAP WS toolkits make it 
easy to define new interfaces for remote interaction, often relying on introspection to extract the 
WSDL and service API from Java or C# code. This is viewed as a feature by the SOAP stack 
authors (and many users), but it is feared that it can increase the brittleness of the systems, in 
which a minor change on the server (even an upgrade of the SOAP stack) can result in different 
WSDL and a different service interface. Similarly, the client-side classes that can be generated 
from WSDL and XSD descriptions of the service are often tied to a particular version of the 
SOAP endpoint, and can break if the endpoint changes or the client-side SOAP stack is 
upgraded. Well designed SOAP endpoints (with handwritten XSD and WSDL) do not suffer 
from this, but there is still the problem that a custom interface for every service requires a custom 
client for every service. 

There are also concerns about performance, because of Web services’ use of XML as a 
message format and SOAP and HTTP in enveloping and transport. At the same time, there are 
also emerging XML parsing/indexing technologies, such as VTD-XML, that promise to address 
those XML-related performance issues. 


Similar efforts 


There are several other approaches to the set of problems that Web services attempt to 
address, both preceding and contemporary to it. RMI was one of many middleware systems that 
have seen wide deployment. More ambitious efforts like CORBA and DCOM both attempted to 
effect distributed objects, which Web services implementations sometimes try to mimic. 

More basic efforts include XML-RPC, a precursor to SOAP that was only capable of RPC, 
and various forms of using HTTP without SOAP. 
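
For comparison with SOAP, the sketch below shows what an XML-RPC call looks like on the wire, using the `xmlrpc.client` module from Python's standard library. The remote method name `add` and the helper function names are hypothetical, and no server is contacted: the code only encodes and decodes the messages.

```python
import xmlrpc.client


def encode_call(method: str, *args) -> str:
    """Marshal a method name and its arguments into an XML-RPC request body."""
    return xmlrpc.client.dumps(args, methodname=method)


def decode_call(payload: str):
    """Recover (method name, argument tuple) from a request body."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


def encode_reply(value) -> str:
    """Marshal a single return value into an XML-RPC response body."""
    return xmlrpc.client.dumps((value,), methodresponse=True)


def decode_reply(payload: str):
    """Recover the single return value from a response body."""
    (value,), _ = xmlrpc.client.loads(payload)
    return value
```

The request names a method and carries positional parameters only, which is why XML-RPC is described above as capable of nothing but RPC.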




Enterprise application integration 


Enterprise Application Integration (EAI) is defined as the use of software and computer 
systems architectural principles to integrate a set of enterprise computer applications. 


Rationale for EAI 


In today’s competitive and dynamic business environment, applications such as Supply 
Chain Management, Customer Relationship Management, Business Intelligence and Integrated 
Collaboration environments have become imperative for organizations that need to maintain 
their competitive advantage. Enterprise Application Integration (EAI) is the process of linking 
these applications and others in order to realize financial and operational competitive advantages. 

When different systems can’t share their data effectively, they create information bottlenecks 
that require human intervention in the form of decision making or data entry. With a properly 
deployed EAI architecture, organizations are able to focus most of their efforts on their value- 
creating core competencies instead of focusing on workflow management. 

For generations, systems have been built that have served a single purpose for a single set of 
users without sufficient thought to integrating these systems into larger systems and multiple 
applications. EAI is the solution to the unanticipated outcome of generations of development 
undertaken without a central vision or strategy. The demand of the enterprise is to share data and 
processes without having to make sweeping changes to the applications or data structures. Only 
by creating a method of accomplishing this integration can EAI be both functional and cost- 
effective. 

One of the challenges facing modern organizations is giving all their workers complete, 
transparent and real-time access to information. Many of the legacy applications still in use today 
were developed using arcane and proprietary technologies, thus creating information silos across 
departmental lines within organizations. These systems hampered seamless movement of 
information from one application to the other. EAI, as a discipline, aims to alleviate many of 
these problems, as well as create new paradigms for truly lean proactive organizations. EAI 
intends to transcend the simple goal of linking applications, and attempts to enable new and 
innovative ways of leveraging organizational knowledge to create further competitive advantages 
for the enterprise. 

EAI is a response to decades of creating distributed monolithic, single purpose applications 
leveraging a hodgepodge of platforms and development approaches. EAI represents the solution 
to a problem that has existed since applications first moved from central processors. Put briefly, 
EAI is the “unrestricted sharing of data and business processes among any connected application 
or data sources in the enterprise.” 

Undoubtedly, there are a number of instances of stovepipe systems in an enterprise, such as 
inventory control systems, sales automation systems, general ledger systems, and human 
resource systems. These systems typically were custom-built with specific needs in mind, 
utilizing the technology-of-the-day. Many used non-standard data storage and application 
development technology. 


Improving connectivity 


Enterprise Application Integration has increased in importance because enterprise computing 
often takes the form of islands of automation. This occurs when the value of individual systems 
is not maximized due to partial or full isolation. If integration is applied without following a 
structured EAI approach, point-to-point connections grow across an organization. Dependencies 
are added on an impromptu basis, resulting in a tangled mess that is difficult to maintain. This is 
commonly referred to as spaghetti, an allusion to the programming equivalent of spaghetti code. 
For example: 

The number of connections needed to fully mesh n applications point-to-point is given by 
$\frac{n(n-1)}{2}$. Thus, for 10 applications to be fully integrated point-to-point, 
$\frac{(10)(9)}{2}$, or 45, point-to-point connections are needed. 
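
The growth in connections can be checked with a short calculation. The following is a minimal sketch, not part of any EAI product: the function names are hypothetical, and the hub count assumes that each application needs exactly one link to a central integration point.

```python
def point_to_point_links(n: int) -> int:
    """Links needed to connect every pair of n applications directly."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


def hub_links(n: int) -> int:
    """Links needed when each of n applications connects only to a central hub.

    Assumption for illustration: one link per application.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return n


if __name__ == "__main__":
    # Compare both approaches for a few system counts.
    for n in (2, 5, 10, 50):
        print(f"{n} applications: {point_to_point_links(n)} direct links, "
              f"{hub_links(n)} hub links")
```

For 10 applications the first function returns 45, matching the figure above, while the number of direct links grows quadratically as more applications are added.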

However, EAI is not just about sharing data between applications; it focuses on sharing both 
business data and business process. Attending to EAI involves looking at the system of systems, 
which involves large scale inter-disciplinary problems with multiple, heterogeneous, distributed 
systems that are embedded in networks at multiple levels. 


Purposes of EAI 


EAI can be used for different purposes: 

• Data (information) integration: ensuring that information in multiple systems is 
kept consistent. This is also known as EII (Enterprise Information Integration). 

• Process integration: linking business processes across applications. 

• Vendor independence: extracting business policies or rules from applications 
and implementing them in the EAI system, so that even if one of the business 
applications is replaced with a different vendor's application, the business rules 
do not have to be re-implemented. 

• Common facade: An EAI system could front-end a cluster of applications, 
providing a single consistent access interface to these applications and shielding 
users from having to learn to interact with different applications. 


EAI Patterns 


Integration Patterns 


There are two patterns that EAI systems implement: 

• Mediation: Here, the EAI system acts as the go-between or broker between 
multiple applications. Whenever an interesting event occurs in an application 
(e.g., new information created, new transaction completed, etc.) an integration 
module in the EAI system is notified. The module then propagates the changes 
to other relevant applications. 

• Federation: In this case, the EAI system acts as the overarching facade across 
multiple applications. All accesses from the ‘outside world’ to any of the 
applications are front-ended by the EAI system. The EAI system is configured to 
expose only the relevant information and interfaces of the underlying 
applications to the outside world, and performs all interactions with the 
underlying applications on behalf of the requester. 

Both patterns are often used concurrently. The same EAI system could be keeping multiple 
applications in sync (mediation), while servicing requests from external users against these 
applications (federation). 
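
The two patterns can be illustrated with a small in-memory sketch. Everything here is hypothetical and chosen only for illustration: the class names MediationBroker and FederationFacade, the event name "customer.created", and the two dictionaries that stand in for real back-end applications. A real EAI system would add adapters, persistence and error handling.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], None]


class MediationBroker:
    """Mediation: notifies subscribed integration modules when an event occurs."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: Dict[str, Any]) -> int:
        """Deliver the event to every subscriber; return how many were notified."""
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


class FederationFacade:
    """Federation: single entry point exposing only selected operations."""

    def __init__(self) -> None:
        self._operations: Dict[str, Callable[..., Any]] = {}

    def expose(self, name: str, operation: Callable[..., Any]) -> None:
        self._operations[name] = operation

    def request(self, name: str, *args: Any) -> Any:
        if name not in self._operations:
            raise PermissionError(f"operation not exposed: {name}")
        return self._operations[name](*args)


# Hypothetical back-end applications, modelled as plain dictionaries.
crm_customers: Dict[int, str] = {}
billing_customers: Dict[int, str] = {}

# Mediation: keep the billing application in sync with the CRM application.
broker = MediationBroker()
broker.subscribe(
    "customer.created",
    lambda event: billing_customers.update({event["id"]: event["name"]}),
)


def create_customer(customer_id: int, name: str) -> None:
    """Create a customer in the CRM and announce the event to the broker."""
    crm_customers[customer_id] = name
    broker.publish("customer.created", {"id": customer_id, "name": name})


# Federation: outside callers only see what the facade exposes.
facade = FederationFacade()
facade.expose("get_billing_name", lambda customer_id: billing_customers[customer_id])

create_customer(1, "Acme Ltd")
print(facade.request("get_billing_name", 1))
```

In this sketch the same process performs both roles, in line with the remark that the two patterns are often used concurrently.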


Access Patterns 


EAI supports both asynchronous and synchronous access patterns, the former being typical in 
the mediation case and the latter in the federation case. 


Lifetime Patterns 


An integration operation could be short-lived (e.g., keeping data in sync across two 
applications could be completed within a second) or long-lived (e.g., one of the steps could 
involve the EAI system interacting with a human workflow application for approval of a loan 
that takes hours or days to complete). 


EAI Topologies 


There are two major topologies: hub-and-spoke, and bus. Each has its own advantages and 
disadvantages. In the hub-and-spoke model, the EAI system is at the center (the hub), and 
interacts with the applications via the spokes. In the bus model, the EAI system is the bus (or is 
implemented as a resident module in an already existing message bus or message-oriented 
middleware). 


Technologies 


Multiple technologies are used in implementing each of the components of the EAI system: 
• Bus/hub: This is usually implemented by enhancing standard middleware 
products (application server, message bus) or implemented as a stand-alone 
program (i.e., does not use any middleware), acting as its own middleware. 

• Application connectivity: The bus/hub connects to applications through a set 
of adapters (also referred to as connectors). These are programs that know 
how to interact with an underlying business application. The adapter performs 
two-way communication, performing requests from the hub against the 
application, and notifying the hub when an event of interest occurs in the 
application (a new record inserted, a transaction completed, etc.). Adapters can 
be specific to an application (e.g., built against the application vendor's client 
libraries) or specific to a class of applications (e.g., can interact with any 
application through a standard communication protocol, such as SOAP or 
SMTP). The adapter could reside in the same process space as the bus/hub or 
execute in a remote location and interact with the hub/bus through industry 
standard protocols such as message queues, web services, or even use a 
proprietary protocol. In the Java world, standards such as JCA allow adapters to 
be created in a vendor-neutral manner. 

• Data format and transformation: To avoid every adapter having to convert 
data to/from every other application's format, EAI systems usually stipulate an 
application-independent (or common) data format. The EAI system usually 
provides a data transformation service as well to help convert between 
application-specific and common formats. This is done in two steps: the adapter 
converts information from the application's format to the bus's common format. 
Then, semantic transformations are applied on this (converting zip codes to city 
names, splitting/merging objects from one application into objects in the other 
applications, and so on). 

• Integration modules: An EAI system could be participating in multiple 
concurrent integration operations at any given time, each type of integration 
being processed by a different integration module. Integration modules 
subscribe to events of specific types and process notifications that they receive 
when these events occur. These modules could be implemented in different 
ways: on Java-based EAI systems, these could be web applications or EJBs or 
even POJOs that conform to the EAI system's specifications. 

• Support for transactions: When used for process integration, the EAI system 
also provides transactional consistency across applications by executing all 
integration operations across all applications in a single overarching distributed 
transaction (using two-phase commit protocols or compensating transactions). 
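
The two-step conversion described under data format and transformation can be sketched as follows. This is an illustration under stated assumptions, not the design of any particular product: the CrmAdapter class, the tuple layout of the application record, the dictionary used as the common format, and the ZIP_TO_CITY lookup table are all hypothetical.

```python
from typing import Any, Dict, Tuple

# Hypothetical reference data for the semantic step; a real system would
# consult a reference data service rather than a hard-coded table.
ZIP_TO_CITY: Dict[str, str] = {"10001": "New York", "94105": "San Francisco"}


class CrmAdapter:
    """Hypothetical adapter for an application that stores customers as tuples."""

    def to_common(self, record: Tuple[int, str, str]) -> Dict[str, Any]:
        # Step 1: convert from the application's own format to the
        # application-independent (common) format.
        customer_id, full_name, zip_code = record
        return {"id": customer_id, "name": full_name, "zip": zip_code}


def apply_semantic_transformations(common: Dict[str, Any]) -> Dict[str, Any]:
    # Step 2: semantic transformations on the common format.
    result = dict(common)
    # Convert the zip code to a city name.
    result["city"] = ZIP_TO_CITY.get(common["zip"], "unknown")
    # Split one field into two, as a simple case of splitting objects.
    first, _, last = common["name"].partition(" ")
    result["first_name"], result["last_name"] = first, last
    return result


def integrate(record: Tuple[int, str, str], adapter: CrmAdapter) -> Dict[str, Any]:
    """Run both steps in order: adapter conversion, then semantic transformation."""
    return apply_semantic_transformations(adapter.to_common(record))


message = integrate((7, "Ada Lovelace", "10001"), CrmAdapter())
print(message)
```

Because each adapter only converts to and from the common format, adding a new application requires one new adapter rather than one converter per existing application.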


Communication architectures 


Currently, opinions vary widely on what constitutes the best infrastructure, 
component model, and standards structure for Enterprise Application Integration. There seems to 
be consensus that four things are essential for a modern enterprise application architecture: 

1. There needs to be a centralized broker that handles security, access, and communication. 
This can be accomplished through integration servers (like the School Interoperability 
Framework (SIF) Zone Integration Servers) or through similar software like the Enterprise 
Service Bus (ESB) model, which acts as a SOAP-oriented services manager. 

2. The use of an independent data model based on a standard data structure. It appears that XML 
and the use of XML style sheets have become the de facto and in some cases de jure standard. 

3. A connector, or agent, model where each vendor, application, or interface can build a single 
component that can speak natively to that application and communicate with the centralized 
broker. 

4. A system model that defines the APIs, data flow and rules of engagement to the system such 
that components can be built to interface with it in a standardized way. 


Although other approaches like connecting at the database or user-interface level have been 
explored, they have not been found to scale or be able to adjust. Individual applications can 
publish messages to the centralized broker and subscribe to receive certain messages from that 
broker. Each application only requires one connection to the broker. This central control 
approach can be extremely scalable and highly evolvable. 

Enterprise Application Integration is related to middleware technologies such as 
message-oriented middleware (MOM), and data representation technologies such as XML. Other EAI 
technologies involve using web services as part of service-oriented architecture as a means of 
integration. Enterprise Application Integration tends to be data centric. In the near future, it will 
come to include content integration and business processes. 


EAI implementation pitfalls 


In 2003 it was reported that 70% of all EAI projects fail. Most of these failures are not due to 
the software itself or technical difficulties, but due to management issues. EAIIC European 
Chairman Steve Craggs has outlined the seven main pitfalls encountered by companies using EAI 
systems and explained solutions to these problems. 

• Constant change 
The very nature of EAI is dynamic and requires dynamic project managers to manage 
their implementation. 

• Lack of EAI experts 
EAI requires knowledge of many issues and technical aspects. 

• Competing standards 
Within the EAI field, the paradox is that EAI standards themselves are not universal. 

• EAI is a tool paradigm 
EAI is not a tool, but rather a system and should be implemented as such. 

• Building interfaces is an art 
Engineering the solution is not sufficient. Solutions need to be negotiated with user 
departments to reach a common consensus on the final outcome. A lack of consensus on 
interface designs leads to excessive effort to map between various systems' data 
requirements. 

• Loss of detail 
Information that seemed unimportant at an earlier stage may become crucial later. 

• Accountability 
Since so many departments have many conflicting requirements, there should be clear 
accountability for the system's final structure. 

Other potential problems may arise in these areas: 

• Emerging Requirements 
EAI implementations should be extensible and modular to allow for future changes. 

• Protectionism 
The applications whose data is being integrated often belong to different departments 
which have technical, cultural, and political reasons for not wanting to share their data 
with other departments. 


Advantages and Disadvantages 


• Advantages 

  ◦ Real-time information access among systems. 

  ◦ Streamlines business processes and helps raise organizational efficiency. 

  ◦ Maintains information integrity across multiple systems. 

• Disadvantages 

  ◦ Prohibitively high development costs, especially for small and mid-sized 
    businesses (SMBs). 

  ◦ EAI implementations are very time consuming, and need a lot of resources. 

  ◦ Requires a fair amount of up-front design, which many managers are not able 
    to envision or not willing to invest in. Most EAI projects start off as 
    point-to-point efforts, very soon becoming unmanageable as the number of 
    applications increases. 


The Future of EAI 


EAI technologies are still being developed and there still isn’t a consensus on the ideal 
approach or the correct group of technologies a company should use. A common pitfall is to use 
other proprietary technologies that claim to be open and extensible but create vendor lock-in. 


Open Source Projects 


• Apache ActiveMQ 
• Apache Camel 
• Apache ServiceMix 
• Bostech 
• ESB.NET 
• Jitterbit Open Source Integration 
• Virtuoso Universal Server 


Instant messaging 


Instant messaging (IM) is a form of real-time communication between two or more people 
based on typed text. The text is conveyed via computers connected over a network such as the 
Internet. 


Overview 


Instant messaging requires an instant messaging client that connects to an instant messaging 
service. Instant messaging differs from e-mail in that conversations happen in real-time. A 
multiprotocol instant messaging application allows one client to connect to multiple IM 
networks. 

Instant messaging services owe many ideas to an older and still popular online chat medium 
named Internet Relay Chat (IRC). In early instant messaging programs, each letter appeared 
when it was typed, and when letters were deleted to correct typos this was also seen in real time. 
This made it more like a telephone conversation than exchanging letters. In modern instant 
messaging programs, the other party in the conversation generally only sees each line of text 
right after a new line is started. Most instant messaging applications also include the ability to set 
a status message, roughly analogous to the message on a telephone answering machine. 

The UNIX "talk" command, an early program of this character-by-character kind, was popular 
in the 1980s and early 1990s. 


Benefits 


Instant messaging offers real-time communication and allows easy collaboration, which 
might be considered more akin to genuine conversation than email's "letter" format. In contrast 
to e-mail, the parties know whether the peer is available. Most systems allow the user to set an 
online status or away message so peers are notified when the user is available, busy, or away 
from the computer. On the other hand, people are not forced to reply immediately to incoming 
messages. For this reason, some people consider communication via instant messaging to be less 
intrusive than communication via phone. Some, though not all, popular systems also allow the 
sending of messages to people not currently logged on (offline messages), which removes much 
of the difference between IM and e-mail. 

Instant messaging allows instantaneous communication between a number of parties 
simultaneously, by transmitting information quickly and efficiently, with immediate receipt 
of acknowledgment or reply. In some cases IM includes additional features that make it 
even more popular, such as seeing the other party via webcam or talking directly for free 
over the Internet. 

It is possible to save a conversation for later reference. Instant messages are typically 
logged in a local message history, which closes the gap to the persistent nature of e-mail and 
facilitates quick exchange of information like URLs or document snippets (which can be 
unwieldy when communicated via telephone). 


History 


Instant messaging applications began to appear in the 1970s on multi-user operating systems 
like UNIX, initially to facilitate communication with other users logged in to the same machine, 
then on the local network, and subsequently across the Internet. Some of these used a 
peer-to-peer protocol (e.g., talk, ntalk and ytalk), while others required peers to connect to a server. 
Because all of these protocols were based inside a console window, most of those discovering 
the Internet in the mid-1990s and equating it with the web tended not to encounter them. 

In the last half of the 1980s and into the early 1990s, the Quantum Link online service for 
Commodore 64 computers offered user-to-user messages between currently connected customers 
which they called "On-Line Messages" (or OLM for short). Quantum Link's better known later 
incarnation, America Online, offers a similar product under the name "AOL Instant Messages" 
(AIM). While the Quantum Link service ran on a Commodore 64, using only the Commodore's 
PETSCII text-graphics, the screen was visually divided up into sections and OLMs would appear 
as a yellow bar saying "Message From:" and the name of the sender along with the message 
across the top of whatever the user was already doing, and presented a list of options for 
responding.[1] As such, it could be considered a sort of GUI, albeit much more primitive than 
the later Unix, Windows and Macintosh based GUI IM programs. OLMs were what Q-Link 
called "Plus Services" meaning they charged an extra per-minute fee on top of the monthly Q- 
Link access costs. 

Modern, Internet-wide, GUI-based messaging clients, as they are known today, began to take 
off in the mid-1990s with ICQ (1996) being the first, followed by AOL Instant Messenger 
(AIM, 1997). AOL later acquired Mirabilis, the creators of ICQ. A few years later 
ICQ (by now owned by AOL) was awarded two patents for instant messaging by the U.S. patent 
office. Meanwhile, other companies developed their own applications (Yahoo, MSN, Excite, 
Ubique, IBM), each with its own proprietary protocol and client; users therefore had to run 
multiple client applications if they wished to use more than one of these networks. 

In 2000, an open source application and open standards-based protocol called Jabber was 
launched. Jabber servers could act as gateways to other IM protocols, reducing the need to run 
multiple clients. Modern multi-protocol clients such as Pidgin, Trillian, Adium and Miranda can 
use any of the popular IM protocols without the need for a server gateway. 

Recently, many instant messaging services have begun to offer video conferencing features, 
Voice Over IP (VoIP) and web conferencing services. Web conferencing services integrate both 
video conferencing and instant messaging capabilities. Some newer instant messaging companies 
are offering desktop sharing, IP radio, and IPTV to the voice and video features. 

The term "instant messenger" is a service mark of Time Warner[2] and may not be used in 
software not affiliated with AOL in the United States. For this reason, the developers of the 
instant messaging client formerly known as Gaim announced in April 2007 that it would be 
renamed "Pidgin". 


Cooperation 


There have been several attempts to create a unified standard for instant messaging: IETF's 
SIP (Session Initiation Protocol) and SIMPLE (SIP for Instant Messaging and Presence 
Leveraging Extensions), APEX (Application Exchange), PRIM (Presence and Instant Messaging 
Protocol), the open XML-based XMPP (Extensible Messaging and Presence Protocol), more 
commonly known as Jabber, and OMA's (Open Mobile Alliance) IMPS (Instant Messaging and 
Presence Service), created specifically for mobile devices. 

Most attempts at creating a unified standard for the major IM providers (AOL, Yahoo! and 
Microsoft) have failed and each continues to use its own proprietary protocol. 

However, while discussions at IETF were stalled, Reuters' head of collaboration services, 
David Gurle (the founder of Microsoft's Real Time Communication and Collaboration business), 
surprised everybody by signing the first inter-service provider connectivity agreement in 
September 2003. This historic agreement enabled AIM, ICQ and MSN Messenger users to talk 
with Reuters Messaging counterparts, and vice versa, for an access fee. Following this 
breakthrough agreement between networks, Microsoft, Yahoo! and AOL came to a deal under which 
users of Microsoft's Live Communication Server 2005 (which is also used by Reuters for its 
Reuters Messaging service) would also be able to talk to public instant messaging users. 
This deal settled once and for all on SIP/SIMPLE as the protocol for interconnectivity in the 
market and established a connectivity fee for accessing public instant messaging clouds. 
Separately, on October 13, 2005, Microsoft and Yahoo! announced that by the (Northern 
Hemisphere) summer of 2006 they would interoperate using SIP/SIMPLE. This was followed in 
December 2005 by the AOL and Google strategic partnership deal, under which Google Talk users 
would be able to talk with AIM and ICQ users provided they have an identity at AOL. 

There are two ways to combine the many disparate protocols: 

1. One way is to combine the many disparate protocols inside the IM client 
application. 

2. The other way is to combine the many disparate protocols inside the IM server 
application. This approach moves the task of communicating with the other 
services to the server, so clients need not know or care about other IM protocols. 
An example is LCS 2005 Public IM Connectivity. This approach is popular in 
Jabber/XMPP servers; however, the so-called transport projects suffer the same 
reverse engineering difficulties as any other project involved with closed 
protocols or formats. 
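
The first approach, combining protocols inside the client, can be sketched as a client that hides several protocol handlers behind one interface. This is a toy illustration only: the class names and network names are hypothetical, and the two wire formats are invented for the example. They are not the real XMPP, IRC, or any proprietary IM protocol.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple
from xml.sax.saxutils import escape, quoteattr


class ProtocolHandler(ABC):
    """Common interface the client uses for every IM network."""

    @abstractmethod
    def encode(self, recipient: str, text: str) -> str:
        """Return the wire representation of a message for this network."""


class XmlStyleHandler(ProtocolHandler):
    """Invented XML-flavoured format, for illustration only."""

    def encode(self, recipient: str, text: str) -> str:
        return f"<message to={quoteattr(recipient)}><body>{escape(text)}</body></message>"


class LineStyleHandler(ProtocolHandler):
    """Invented line-oriented format, for illustration only."""

    def encode(self, recipient: str, text: str) -> str:
        return f"MSG {recipient} :{text}"


class MultiProtocolClient:
    """One user-facing client that delegates to a handler per network."""

    def __init__(self) -> None:
        self._handlers: Dict[str, ProtocolHandler] = {}
        # Stand-in for the network connections: (network, wire data) pairs.
        self.outbox: List[Tuple[str, str]] = []

    def register(self, network: str, handler: ProtocolHandler) -> None:
        self._handlers[network] = handler

    def send(self, network: str, recipient: str, text: str) -> str:
        if network not in self._handlers:
            raise KeyError(f"no handler registered for network: {network}")
        wire = self._handlers[network].encode(recipient, text)
        self.outbox.append((network, wire))
        return wire


client = MultiProtocolClient()
client.register("xmlnet", XmlStyleHandler())
client.register("linenet", LineStyleHandler())
print(client.send("xmlnet", "bob@example.org", "hello"))
print(client.send("linenet", "bob", "hello"))
```

In the second approach the same handler registry would live in the server, so that each client speaks a single protocol and the server performs the translation.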

Some approaches, such as that adopted by the Sonork enterprise IM software or the 
Jabber/XMPP network or Winpopup LAN Messenger, allow organizations to create their own 
private instant messaging network by enabling them to limit access to the server (often with the 
IM network entirely behind their firewall) and administer user permissions. Other corporate 
messaging systems allow registered users to also connect from outside the corporate LAN, by 
using a secure firewall-friendly HTTPS based protocol. Typically, a dedicated corporate IM 
server has several advantages such as pre-populated contact lists, integrated authentication, and 
better security and privacy. 

Some networks have made changes to prevent them from being utilized by such multi- 
network IM clients. For example, Trillian had to release several revisions and patches to allow its 
users to access the MSN, AOL, and Yahoo! networks, after changes were made to these 
networks. The major IM providers typically cite the need for formal agreements as well as 
security concerns as reasons for making these changes. 


Mobile Instant Messaging 


Mobile Instant Messaging (MIM) is a presence-enabled messaging service that aims to 
transpose the desktop messaging experience to the usage scenario of being on the move. While 
several of the core ideas of the desktop experience apply to a connected mobile device, others 
do not: users usually only look at their phone's screen, presence status changes might occur 
under different circumstances than at the desktop, and several functional limits exist because 
the vast majority of mobile communication devices are chosen by their users to fit into the 
palm of their hand. 

Some of the form factor and mobility related differences need to be taken into account in 
order to create a really adequate, powerful and yet convenient mobile experience: radio 
bandwidth, memory size, availability of media formats, keypad based input, screen output, CPU 
performance and battery power are core issues that desktop device users and even nomadic users 
with connected notebooks are usually not exposed to. 

Several formerly untackled issues have been identified and addressed within IMPS, which 
was developed as part of an early mobile telephone industry initiative to kick off a broader usage 
of mobile instant messaging. The Open Mobile Alliance has taken over this standard, formerly 
called Wireless Village, as IMPS V1.0 in November 2002. Since then this standard has been 
further developed to IMPS V1.3, the latest candidate for release, which is expected to be released 
before the end of 2006. 

There are downloadable mobile applications offered by different independent developers that 
allow users to chat within public (MSN, Yahoo!, Google Talk, AIM, ICQ) and corporate (LCS, 
Sametime, Reuters) IM services from mobile devices. 

Among the advantages of using such IM clients over SMS are that they use the data connection 
instead of SMS text messages and offer an IM-like chat mode with faster messaging. Some IM 
software allows group communication. 

Several large-scale mobile telephone industry companies are planning to jointly deliver a 
ubiquitous, interoperable presence-enabled messaging service, built according to interoperability 
recommendations developed in the GSM Association. Considering these organisations 
jointly represent approximately 1.5 billion active Short Text Messaging (SMS) users, it 
remains to be seen if such an initiative may also help to drive the different industry factions to 
agree on a truly interoperable approach, at least for Mobile Instant Messaging, sometime in the 
not too distant future. 

In the meantime, other developments have proposed usage of downloadable applications with 
the intention to create their own approach to IM that runs on most mobile phones worldwide. 
Essentially, several of these clients are Java applications that are instantly downloaded and then 
connected to back-end servers through GPRS/3G Internet Channels. Some of the 
implementations can connect to other IM services. 


Effects on people with an auditory or speech disability 


Instant messaging opens new methods of spontaneous communication for people who have an 
impairment in hearing, auditory processing, or speech. It is considered by many a powerful way 
to allow equal opportunities in communication, without the aid of special devices or services 
designed for users with hearing loss. 


Friend-to-friend networks 


Instant messaging may be done in a friend-to-friend network, in which each node connects 
to the friends on its friends list. This allows users to communicate with friends of friends and 
to build chatrooms for instant messages with all friends on that network. 


Business application 


Instant messaging has proven to be similar to personal computers, e-mail, and the WWW, in 
that its adoption for use as a business communications medium was driven primarily by 
individual employees using consumer software at work, rather than by formal mandate or 
provisioning by corporate information technology departments. Tens of millions of the consumer 
IM accounts in use are being used for business purposes by employees of companies and other 
organizations. 

In response to the demand for business-grade IM and the need to ensure security and legal 
compliance, a new type of instant messaging, called "Enterprise Instant Messaging" ("EIM") was 
created when Lotus Software launched Lotus Sametime in 1999. Microsoft followed suit shortly 
thereafter with Microsoft Exchange Instant Messaging, and later created a new platform called 
Microsoft Office Live Communications Server. Since then, both IBM Lotus and Microsoft have 
introduced federation between their EIM systems and some of the public IM networks so that 
employees may use a single interface to both their internal EIM system and their buddies on 
AOL, MSN, and Yahoo!. Current leading EIM platforms include IBM Lotus Sametime, 
Microsoft Office Live Communications Server, and Jabber XCP. 

The adoption of IM across corporate networks outside of the control of IT organizations 
creates many risks and liabilities for companies who do not effectively manage and support IM 
use. Companies implement specialized IM archiving and security products and services like 
those from Secure Computing, Akonix, Surfcontrol, and ScanSafe to mitigate these risks and 
provide safe, secure, productive instant messaging capabilities to their employees. 

On the other hand, the informal usage of instant messaging by the younger generation has 
also affected the quality of messages, as younger employees carry the habits they are used to in 
instant messaging into the workplace. The tendency to misspell, the use of informal language 
and emoticons, and the shortening of longer or commonly used words while chatting casually 
online with friends might unconsciously seep into the more formal and serious conversations in 
the workplace. Although instant messaging has made relaying messages faster and more 
convenient, certain guidelines and etiquette rules should be observed to make full use 
of this important tool. 


Risks and liabilities 


Although instant messaging delivers many benefits, it also carries with it certain risks and 
liabilities, particularly when used in workplaces. Among these are: 

• Security risks (e.g. IM used to infect computers with spyware, viruses, trojans, 
worms) 

• Compliance risks 

• Inappropriate use 

• Intellectual property leakage 

Hackers' use of instant messaging networks to deliver malicious code has grown consistently 
from 2004 to the present, with the number of discrete attacks listed by the IM Security Center 
having grown about 17% from 347 attacks in 2005 to 406 in 2006. Hackers use two methods of 
delivering malicious code through IM: delivery of a virus, trojan, or spyware within an infected 
file, and the use of "socially engineered" text with a web address that entices the recipient to 
click on a URL that connects him or her to a website that then downloads malicious code. 
Viruses, worms, and trojans typically propagate by sending themselves rapidly through the 
infected user's buddy list. An effective attack using a "poison URL" may reach tens of thousands 
of people in minutes when each person's buddy list receives messages appearing to be from a 
trusted friend. The recipients click on the web address, and the entire cycle starts again. 
Infections may range from nuisance to criminal, and are becoming more sophisticated each year. 

In addition to the malicious code threat, the use of instant messaging at work also creates a 
risk of non-compliance to laws and regulations governing the use of electronic communications 
in businesses. In the United States alone there are over 10,000 laws and regulations related to 
electronic messaging and records retention. The better known of these include the 
Sarbanes-Oxley Act, HIPAA, and SEC 17a-3. Recent changes to the Federal Rules of Civil 
Procedure, effective December 1, 2006, create a new category for electronic records which may 
be requested during discovery in legal proceedings. Most countries around the world also 
regulate the use of electronic messaging and electronic records retention in similar fashion to the 
United States. The most common regulations related to IM at work involve the need to produce 
archived business communications to satisfy government or judicial requests under law. Many 
instant messaging communications fall into the category of business communications that must 
be archived and retrievable. 

Organizations of all types must protect themselves from the liability of their employees' 
inappropriate use of IM. The informal, immediate, and ostensibly anonymous nature of instant 
messaging makes it a candidate for abuse in the workplace. The topic of inappropriate IM use 
became front page news in October 2006 when Congressman Mark Foley resigned his seat after 
admitting sending offensive instant messages of a sexual nature to underage former House pages 
from his Congressional office PC. The Mark Foley Scandal led to media coverage and 
mainstream newspaper articles warning of the risks of inappropriate IM use in workplaces. In 
most countries, corporations have a legal responsibility to ensure a harassment-free work 
environment for employees. The use of corporate-owned computers, networks, and software to 
harass an individual or spread inappropriate jokes or language creates a liability for not only the 
offender but also the employer. A survey by IM archiving and security provider Akonix Systems, 
Inc. in March 2007 showed that 31% of respondents had been harassed over IM at work. 
Companies now include instant messaging as an integral component of their policies on 
appropriate use of the World Wide Web, email, and other corporate assets. 


Security and archiving 


In the early 2000s, a new class of IT security provider emerged to provide remedies for the 
risks and liabilities faced by corporations who chose to use IM for business communications. 
The IM security providers created new products to be installed in corporate networks for the 
purpose of archiving, content-scanning, and security-scanning IM traffic moving in and out of 
the corporation. Similar to the e-mail filtering vendors, the IM security providers focus on the 
risks and liabilities described above. 

With rapid adoption of IM in the workplace, demand for IM security products began to grow 
in the mid-2000s. By 2007, the preferred platform for the purchase of security software had 
become the "appliance", according to IDC, who estimate that by 2008, 80% of network security 
products will be delivered via an appliance. 


User base 


Note that many of the numbers listed in this section are not directly comparable, and some 
are speculative. Some instant messaging systems are distributed among many different instances 
and thus difficult to measure in total (e.g. Jabber). While some numbers are given by the owners 
of a complete instant messaging system, others are provided by commercial vendors of a part of 
a distributed system. Some companies may be motivated to inflate their numbers in order to 
increase advertisement earnings or to attract partners, clients, or customers. Importantly, some 
numbers are reported as the number of "active" users (without a shared standard of that activity), 
others indicate total user accounts, while others indicate only the users logged in during an 
instance of peak usage. 

• AIM: 53 million active users (September 2006), "over 100 million" total 
  (January 2006). 

• Jabber: between 40 and 50 million (January 2007). Note that this number is 
  based on calculations of Jabber Inc ("nearly 10 million open source users") 
  which differ from those of Process-One ("Our total deployments account for 
  more than 20 millions of accounts"). Process-One is a company providing 
  services based on the Jabber server software ejabberd. Accordingly, as there are 
  many other open source servers (some also with companies behind them), the 
  number provided by Jabber Inc is probably too small. If we presume ejabberd 
  has a 40% market share amongst public and non-public open source server 
  deployments, there are 50 million users on open source servers. This 
  would mean, including Jabber Inc's numbers, that there are around 90 million 
  Jabber users instead of 50 million. 

• eBuddy (MSN, Yahoo! and AIM): 35 million users (including 4 million mobile) 
  (October 2006). 

• MSN: 27.2 million active (September 2006), 155 million total (April 2005). 

• Yahoo! Messenger: 22 million users (September 2006). 

• QQ: 20 million peak online users, 221 million "active"[9] (majority in China) 
  (July 2006). 

• Sametime: 15 million (enterprise) users (undated). 

• Skype: 9 million peak online (January 2007), 137 million total (January 2007). 

• Xfire: 6.1 million users (January 2007). 

• Gadu-Gadu: 5.6 million users (June 2006). 

• ICQ: 4 million active (September 2006). 

• Paltalk: 3.3 million unique visitors per month (August 2006). 

• MXit: 3 million users (majority in South Africa, more than 200,000 
  international) (31 January 2007). Note that these users are part of the Jabber 
  user base as MXit federates with the Jabber network. 

• PSYC: 1 million users, daily (majority in Brazil) (February 2007). The total number 
  of users cannot be estimated due to the decentralized nature of the protocol. 

• Meebo (Yahoo!, MSN, Jabber, AIM and ICQ): 1 million users (October 2006). 

• IMVU: 1 million users (June 2007). 


Internet search 


A search engine is an information retrieval system designed to help find information stored 
on a computer system, such as on the World Wide Web, inside a corporate or proprietary 
network, or in a personal computer. The search engine allows one to ask for content meeting 
specific criteria (typically those containing a given word or phrase) and retrieves a list of items 
that match those criteria. This list is often sorted with respect to some measure of relevance of 
the results. Search engines use regularly updated indexes to operate quickly and efficiently. 

Without further qualification, search engine usually refers to a Web search engine, which 
searches for information on the public Web. Other kinds of search engine are enterprise search 
engines, which search on intranets, personal search engines, and mobile search engines. Different 
selection and relevance criteria may apply in different environments, or for different uses. 

Some search engines also mine data available in newsgroups, databases, or open directories. 
Unlike Web directories, which are maintained by human editors, search engines operate 
algorithmically or are a mixture of algorithmic and human input. 


History 


The very first tool used for searching on the Internet was Archie.[1] The name stands for 
"archive" without the "v". It was created in 1990 by Alan Emtage, a student at McGill University 
in Montreal. The program downloaded the directory listings of all the files located on public 
anonymous FTP (File Transfer Protocol) sites, creating a searchable database of file names; 
however, Archie did not index the contents of these files. 

The rise of Gopher (created in 1991 by Mark McCahill at the University of Minnesota) led to 
two new search programs, Veronica and Jughead. Like Archie, they searched the file names and 
titles stored in Gopher index systems. Veronica (Very Easy Rodent-Oriented Net-wide Index to 
Computerized Archives) provided a keyword search of most Gopher menu titles in the entire 
Gopher listings. Jughead (Jonzy's Universal Gopher Hierarchy Excavation And Display) was a 
tool for obtaining menu information from specific Gopher servers. While the name of the search 
engine "Archie" was not a reference to the Archie comic book series, "Veronica" and "Jughead" 
are characters in the series, thus referencing their predecessor. 

Timeline 

Note: "Launch" refers only to web availability of original crawl-based web search engine results. 

Year  Engine           Event 
1993  Aliweb           Launch 
1994  WebCrawler       Launch 
1994  JumpStation      Launch 
1994  Infoseek         Launch 
1994  Lycos            Launch 
1995  AltaVista        Launch (part of DEC) 
1995  Excite           Launch 
1996  Dogpile          Launch 
1996  Inktomi          Founded 
1996  HotBot           Founded 
1996  Ask Jeeves       Founded 
1997  Northern Light   Launch 
1998  Google           Launch 
1999  AlltheWeb        Launch 
1999  Naver            Launch 
1999  Teoma            Founded 
1999  Vivisimo         Founded 
2000  Baidu            Founded 
2003  Info.com         Launch 
2004  Yahoo! Search    Final launch 
2004  A9.com           Launch 
2005  MSN Search       Final launch 
2005  Ask.com          Launch 
2005  AskMeNow         Launch 
2006  wikiseek         Founded 
2006  Quaero           Founded 
2006  Ask.com          Launch 
2006  Live Search      Launch 
2006  ChaCha           Beta Launch 
2006  Quintura         Beta Launch 
      wikiseek         Launched 
      Tokenizer        Launched 

The first Web search engine was Wandex, a now-defunct index collected by the World Wide 
Web Wanderer, a web crawler developed by Matthew Gray at MIT in 1993. Another very early 
search engine, Aliweb, also appeared in 1993, and still runs today. JumpStation (released in early 
1994) used a crawler to find web pages for searching, but search was limited to the title of web 
pages only. One of the first "full text" crawler-based search engines was WebCrawler, which 
came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which 
has been the standard for all major search engines since. It was also the first one to be widely 
known by the public. Also in 1994, Lycos (which started at Carnegie Mellon University) was 
launched, and became a major commercial endeavor. 

Soon after, many search engines appeared and vied for popularity. These included Excite, 
Infoseek, Inktomi, Northern Light, and AltaVista. In some ways, they competed with popular 
directories such as Yahoo!. Later, the directories integrated or added on search engine 
technology for greater functionality. 

Search engines were also known as some of the brightest stars in the Internet investing frenzy 
that occurred in the late 1990s. Several companies entered the market spectacularly, receiving 
record gains during their initial public offerings. Some have taken down their public search 
engine, and are marketing enterprise-only editions, such as Northern Light. 


Google 


Around 2001, the Google search engine rose to prominence. Its success was based in part on 
the concept of link popularity and PageRank. The number of other websites and webpages that 
link to a given page is taken into consideration with PageRank, on the premise that good or 
desirable pages are linked to more than others. The PageRank of linking pages and the number of 
links on these pages contribute to the PageRank of the linked page. This makes it possible for 
Google to order its results by how many websites link to each found page. Google's minimalist 
user interface is very popular with users, and has since spawned a number of imitators. 
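
The link-popularity idea described above can be sketched as a small power iteration. This is only an illustration under stated assumptions, not Google's production algorithm: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the four-page toy graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to.
    damping=0.85 and iterations=50 are assumed values for this sketch.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy web: "c" is linked to by three pages, "d" by none.
toy_web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(toy_web)
best_page = max(ranks, key=ranks.get)
worst_page = min(ranks, key=ranks.get)
```

In this toy graph the most-linked page ends up with the highest rank and the page nobody links to with the lowest, which is the ordering effect the paragraph describes.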

Google and most other web engines utilize not only PageRank but more than 150 criteria to 
determine relevancy. The algorithm "remembers" where it has been and indexes the number of 
cross-links and relates these into groupings. PageRank is based on citation analysis that was 
developed in the 1950s by Eugene Garfield at the University of Pennsylvania. Google's founders 
cite Garfield's work in their original paper. In this way virtual communities of webpages are 
found. Teoma's search technology uses a communities approach in its ranking algorithm. NEC 
Research Institute has worked on similar technology. Web link analysis was first developed by 
Jon Kleinberg and his team while working on the CLEVER project at IBM's Almaden Research 
Center. Google is currently the most popular search engine. 


Yahoo! Search 


The two founders of Yahoo!, David Filo and Jerry Yang, Ph.D. candidates in Electrical 
Engineering at Stanford University, started their guide in a campus trailer in February 1994 as a 
way to keep track of their personal interests on the Internet. Before long they were spending 
more time on their home-brewed lists of favourite links than on their doctoral dissertations. 
Eventually, Jerry and David's lists became too long and unwieldy, and they broke them out into 
categories. When the categories became too full, they developed subcategories ... and the core 
concept behind Yahoo! was born. In 2002, Yahoo! acquired Inktomi and in 2003, Yahoo! 
acquired Overture, which owned AlltheWeb and AltaVista. Despite owning its own search 
engine, Yahoo! initially kept using Google to provide its users with search results on its main 
website Yahoo.com. However, in 2004, Yahoo! launched its own search engine based on the 
combined technologies of its acquisitions and providing a service that gave pre-eminence to the 
Web search engine over the directory. 


Microsoft 


The most recent major search engine is MSN Search (evolved into Live Search), owned by 
Microsoft, which previously relied on others for its search engine listings. In 2004, it debuted a 
beta version of its own results, powered by its own web crawler (called msnbot). In early 2005, 
it started showing its own results live, and ceased using results from Inktomi, now owned by 
Yahoo!. In 2006, Microsoft migrated to a new search platform, Live Search, retiring the "MSN 
Search" name in the process. 


Baidu 


Due to the difference between hanzi and the Roman alphabet, the Chinese search market did 
not boom until the introduction of Baidu in 2000. 


Top Providers 


Top U.S. Search Providers by Searches, May 2007: 

Provider            Searches (000)   Share of Total Searches (%) 
Google                   4,033,277    56.3 
Yahoo!                   1,540,949    21.5 
MSN/Windows Live           605,400     8.4 
AOL                        381,961     5.3 
Ask.com                    142,418     2.0 
My Web Search               61,784     0.9 
Comcast                     34,908     0.5 
EarthLink                   33,461     0.5 
My Way                      30,122     0.4 
Dogpile.com                 26,295     0.4 
Other                      275,365     3.8 
All search               7,165,940   100.0 

Source: Nielsen//NetRatings, 2007 


Challenges faced by search engines 


• The Web is growing much faster than any present-technology search engine can 
  possibly index (see distributed web crawling). In 2006, some users found that major 
  search engines had become slower to index new webpages; for example, the time to index 
  in MSN Search slowed down in December 2005 and January 2006 (reported 18 January 2006). 

• Many webpages are updated frequently, which forces the search engine to 
  revisit them periodically. 

• The Web search queries one can make are currently limited to searching for key 
  words, which may result in many false positives (see Type I and type II errors), especially 
  when using the default whole-page search. Better results might be achieved by using a 
  proximity-search option with a search-bracket to limit matches within a 
  paragraph or phrase, rather than matching random words scattered across 
  large pages. Another alternative is using human operators to do the researching 
  for the user with organic search engines. 

• Dynamically generated sites may be slow or difficult to index, or may result in 
  excessive results, perhaps generating 500 times more webpages than average. 
  Example: for a dynamic webpage which changes content based on entries 
  inserted from a database, a search engine might be requested to index 50,000 
  static webpages for 50,000 different parameter values passed to that dynamic 
  webpage. The number of indexed pages grows with the number of parameter values: 
  if one parameter value generates 1 indexed webpage, 10 generate 10, and 1,000 
  generate 1,000. Also, some dictionary-page websites are indexed using dynamic 
  pages: for example, search for page-counts of URLs containing variations of 
  "dictionary.*" and observe the page-totals reported by the search engines, 
  perhaps in excess of 50,000 pages. 

• Many dynamically generated websites are not indexable by search engines; this 
  phenomenon is known as the invisible web. There are search engines that 
  specialize in crawling the invisible web by crawling sites that have dynamic 
  content, require forms to be filled out, or are password protected. 

• Relevancy: sometimes the engine can't get what the person is looking for. 

• Some search engines do not rank results by relevance, but by the amount of 
  money the matching websites pay. 

• In 2006, hundreds of generated websites used tricks to manipulate a search 
  engine into displaying them in the higher results for numerous keywords. This can 
  lead to some search engine results being polluted with linkspam or bait-and-switch 
  pages which contain little or no information about the matching phrases. 
  The more relevant webpages are pushed further down in the results list, 
  perhaps by 500 entries or more. For example, many spammers create websites 
  containing random sequences of high-traffic keywords, often with misspellings, 
  in order to attract a higher ranking on a search engine. 

• Secure pages (content hosted on HTTPS URLs) pose a challenge for crawlers, 
  which either can't browse the content for technical reasons or won't index it for 
  privacy reasons. 


How search engines work 


A search engine operates in the following order: 

1. Web crawling 
2. Indexing 
3. Searching 

Web search engines work by storing information about a large number of web pages, which 
they retrieve from the WWW itself. These pages are retrieved by a Web crawler (sometimes also 
known as a spider), an automated Web browser which follows every link it sees. Exclusions 
can be made by the use of robots.txt. The contents of each page are then analyzed to determine 
how it should be indexed (for example, words are extracted from the titles, headings, or special 
fields called meta tags). Data about web pages are stored in an index database for use in later 
queries. Some search engines, such as Google, store all or part of the source page (referred to as 
a cache) as well as information about the web pages, whereas others, such as AltaVista, store 
every word of every page they find. This cached page always holds the actual search text since it 
is the one that was actually indexed, so it can be very useful when the content of the current page 
has been updated and the search terms are no longer in it. This problem might be considered to 
be a mild form of linkrot, and Google's handling of it increases usability by satisfying user 
expectations that the search terms will be on the returned webpage. This satisfies the principle of 
least astonishment since the user normally expects the search terms to be on the returned pages. 
Increased search relevance makes these cached pages very useful, even beyond the fact that they 
may contain data that may no longer be available elsewhere. 
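
The indexing step described above can be sketched as an inverted index: a mapping from each word to the pages (and positions) where it occurs, kept alongside a cached copy of each page. This is a minimal illustration, not how any particular engine stores its data; the three pages, their `example.com` URLs, and the helper names `tokenize`, `build_index`, and `lookup` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build word -> {url: [positions]} from a url -> text mapping."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(url, []).append(position)
    return index


def lookup(index, word):
    """Return the sorted URLs of pages whose indexed text contains the word."""
    return sorted(index.get(word.lower(), {}))


# Hypothetical crawled pages (in practice these would come from a crawler).
pages = {
    "http://example.com/a": "Search engines crawl the Web and build an index.",
    "http://example.com/b": "A crawler follows every link it sees.",
    "http://example.com/c": "The index maps each word to the pages that contain it.",
}

index = build_index(pages)
cache = dict(pages)  # stored copy of each page as it was when indexed

index_hits = lookup(index, "index")
crawler_hits = lookup(index, "crawler")
```

Because `cache` holds the text that was actually indexed, a hit can still be shown with its matching words even if the live page has since changed, which is the usability point made above.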

When a user enters a query into a search engine (typically by using key words), the engine 
examines its index and provides a listing of best-matching web pages according to its criteria, 
usually with a short summary containing the document's title and sometimes parts of the text. 
Most search engines support the use of the boolean operators AND, OR and NOT to further 
specify the search query. Some search engines provide an advanced feature called proximity 
search which allows users to define the distance between keywords. 
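
The boolean operators and proximity search mentioned above map naturally onto set operations over a positional index. The following is a sketch under assumptions: the three short documents and the function names `docs_with` and `near` are made up for illustration, and real engines use far more compact index structures.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


# Hypothetical document collection.
docs = {
    1: "web search engine ranking",
    2: "search the web with a crawler based engine",
    3: "enterprise search engine for intranets",
}

# positions[word][doc_id] -> list of token positions of that word in the document
positions = {}
for doc_id, text in docs.items():
    for pos, word in enumerate(tokenize(text)):
        positions.setdefault(word, {}).setdefault(doc_id, []).append(pos)


def docs_with(word):
    """Set of document ids containing the word."""
    return set(positions.get(word, {}))


def near(word_a, word_b, max_distance):
    """Documents where the two words occur within max_distance tokens of each other."""
    hits = set()
    for doc_id in docs_with(word_a) & docs_with(word_b):
        if any(abs(pa - pb) <= max_distance
               for pa in positions[word_a][doc_id]
               for pb in positions[word_b][doc_id]):
            hits.add(doc_id)
    return hits


and_result = docs_with("web") & docs_with("engine")          # web AND engine
or_result = docs_with("enterprise") | docs_with("crawler")   # enterprise OR crawler
not_result = docs_with("search") - docs_with("web")          # search NOT web
near_result = near("search", "engine", 1)                    # adjacent words only
```

Document 2 contains both "search" and "engine" but seven tokens apart, so it matches the plain AND query and not the proximity query.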

The usefulness of a search engine depends on the relevance of the result set it gives back. 
While there may be millions of webpages that include a particular word or phrase, some pages 
may be more relevant, popular, or authoritative than others. Most search engines employ 
methods to rank the results to provide the "best" results first. How a search engine decides which 
pages are the best matches, and what order the results should be shown in, varies widely from 
one engine to another. The methods also change over time as Internet usage changes and new 
techniques evolve. 

Most Web search engines are commercial ventures supported by advertising revenue and, as 
a result, some employ the controversial practice of allowing advertisers to pay money to have 
their listings ranked higher in search results. Those search engines which do not accept money 
for their search engine results make money by running search related ads alongside the regular 
search engine results. The search engines make money every time someone clicks on one of 
these ads. 

The vast majority of search engines are run by private companies using proprietary 
algorithms and closed databases, though some are open source. 


Storage costs and crawling time 


Storage costs are not the limiting resource in search engine implementation. Simply storing 
10 billion pages of 10 kbytes each (compressed) requires 100 TB, plus another 100 TB or so for 
indexes, giving a total hardware cost of under $200k: 100 cheap PCs, each with four 500 GB disk 
drives. 

However, a public search engine requires considerably more resources than this to calculate 
query results and to provide high availability. Also, the costs of operating a large server farm are 
not trivial. 

Crawling 10 billion pages with 100 machines, each crawling at 100 pages/second, would take 1 
million seconds, or 11.6 days, on a very high capacity Internet connection. Most search engines 
crawl a small fraction of the Web (10-20% of pages) at around this frequency or better, but also 
crawl dynamic websites (e.g. news sites and blogs) at a much higher frequency. 
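
The storage and crawl-time figures in this section can be rechecked with a few lines of arithmetic. The constants below are taken from the text; decimal units (1 TB = 10^12 bytes, 1 kbyte = 1,000 bytes) are an assumption of this sketch.

```python
PAGES = 10_000_000_000          # 10 billion pages
PAGE_SIZE_BYTES = 10_000        # 10 kbytes each, compressed (decimal units assumed)
INDEX_TB = 100                  # "another 100 TB or so for indexes"

MACHINES = 100
DISKS_PER_MACHINE = 4
DISK_SIZE_GB = 500
PAGES_PER_SECOND_PER_MACHINE = 100

page_storage_tb = PAGES * PAGE_SIZE_BYTES / 1e12
total_storage_tb = page_storage_tb + INDEX_TB
cluster_capacity_tb = MACHINES * DISKS_PER_MACHINE * DISK_SIZE_GB / 1000

crawl_seconds = PAGES / (MACHINES * PAGES_PER_SECOND_PER_MACHINE)
crawl_days = crawl_seconds / 86_400

print(f"page storage:     {page_storage_tb:.0f} TB")
print(f"total storage:    {total_storage_tb:.0f} TB")
print(f"cluster capacity: {cluster_capacity_tb:.0f} TB")
print(f"full crawl:       {crawl_seconds:,.0f} s = {crawl_days:.1f} days")
```

The 100 machines supply exactly the 200 TB needed, and the "under $200k" figure in the text corresponds to a budget of under $2,000 per PC.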


Geospatially enabled search engines 


A recent enhancement to search engine technology is the addition of geocoding and 
geoparsing to the processing of the ingested documents being indexed, to enable searching 
within a specified locality (or region). Geoparsing attempts to match any found references to 
locations and places to a geospatial frame of reference, such as a street address, gazetteer 
locations, or to an area (such as a polygonal boundary for a municipality). 
Through this geoparsing process, latitudes and longitudes are assigned to the found places, and 
these latitudes and longitudes are indexed for later spatial query and retrieval. This can enhance 
the search process tremendously by allowing a user to search for documents within a given map 
extent, or conversely, plot the location of documents matching a given keyword to analyze 
incidence and clustering, or any combination of the two. 
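
A toy version of this pipeline might look as follows. It is a sketch only: the three-entry gazetteer with rounded, approximate coordinates, the sample documents, and the names `GeoDoc`, `geoparse`, and `within_extent` are assumptions for illustration, and real geoparsers handle ambiguity, multi-word names, and addresses.

```python
from dataclasses import dataclass


@dataclass
class GeoDoc:
    title: str
    lat: float
    lon: float


# Hypothetical gazetteer: place name -> approximate (latitude, longitude).
GAZETTEER = {
    "montreal": (45.50, -73.57),
    "minneapolis": (44.98, -93.27),
    "stanford": (37.43, -122.17),
}


def geoparse(title, text):
    """Return one GeoDoc for every gazetteer place name mentioned in the text."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return [GeoDoc(title, *GAZETTEER[word]) for word in words if word in GAZETTEER]


def within_extent(docs, south, west, north, east):
    """Spatial query: keep documents whose coordinates fall inside the map extent."""
    return [d for d in docs if south <= d.lat <= north and west <= d.lon <= east]


# Index three made-up documents by the places they mention.
indexed = (
    geoparse("Archie history", "Archie was created at McGill University in Montreal.")
    + geoparse("Gopher history", "Gopher came from the University of Minnesota in Minneapolis.")
    + geoparse("Yahoo history", "Yahoo began in a campus trailer at Stanford.")
)

# Search a map extent covering the north-eastern part of North America.
hits = within_extent(indexed, south=40.0, west=-100.0, north=50.0, east=-70.0)
hit_titles = [d.title for d in hits]
```

The same `indexed` list could be filtered by keyword first and then plotted, which corresponds to the converse use described above.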


Social search 


Social search engines are a type of vertical search engine found on many websites. 


References 


• "Seeking Better Web Searches," Scientific American magazine (February 2005 
  issue). 


Web service specifications 


There are a variety of specifications associated with web services. These specifications are in 
varying degrees of maturity and are maintained or supported by various standards bodies and 
entities. Specifications may complement, overlap, and compete with each other. Web service 
specifications are occasionally referred to collectively as "WS-*", though there is not a single 
managed set of specifications that this consistently refers to, nor a recognized owning body 
across them all. The reference term "WS-*" is more of a general nod to the fact that many 
specifications are named with "WS-" as their prefix. This page includes many of the 
specifications that might be considered a part of "WS-*". 


Web Service Standards Listings 


These sites contain documents and links about the different Web Services standards 
identified on this page. 

• IBM's Web Services Standards Page 
• Microsoft's Web Services Standards Page 
• World Wide Web Consortium's Web Services Activity 
• innoQ's WS-Standards Poster 
• OASIS Standards and Other Approved Work 
• XML CoverPages 


Directory access 


• Universal Description, Discovery, and Integration (UDDI 1.0, 2.0 and 3.0) 
• ebXML 
• WSFL 
• WS-Policy 
• WS-PolicyAssertions 
• WS-PolicyAttachment 
• WS-Policy Framework 
• WS-SecurityPolicy 
• WS-Discovery 
• WS-Inspection 


Service Description (Metadata) 


• Web Services Description Language (WSDL) from the W3C 
• Web Services Semantics (WSDL-S) 
• XINS provides a POX-style Web service specification format 
• WS-MetadataExchange 
• WS-Resource Framework (WSRF) 


Messaging and Function Calls 


• Simple Object Access Protocol (SOAP) 
• SOAP with Attachments 
• SOAP Message Transmission Optimization Mechanism 
• SOAP-over-UDP 
• XML-RPC (XML-based Remote Procedure Call) 
• WS-Eventing 
• WS-Addressing 
• WS-Routing (specification superseded by WS-Addressing) 
• WS-Referral (specification superseded by WS-Addressing) 
• MTOM 
• WS-Enumeration 
• WS-Transfer 


Web Services Interoperability organization (WS-I) Specifications 


These specifications provide additional information to improve interoperability between 
vendor implementations. 

• WS-I Basic Profile 
• WS-I Basic Security Profile 
• Simple SOAP Binding Profile 


Business Process Specifications 


• WS-BPEL 
• WS-CDL: the Web Services Choreography Description Language is an XML-based language that 
  describes peer-to-peer collaborations of Web Services participants. 


Security Specifications 


• XML Signature 
• XML Encryption 
• XML Key Management (XKMS) 
• WS-Security 
• WS-SecureConversation 
• WS-SecurityPolicy 
• WS-Trust 
• WS-Federation 
• WS-Federation Active Requestor Profile 
• WS-Federation Passive Requestor Profile 
• Web Services Security Kerberos Binding 
• Web Single Sign-On Interoperability Profile 
• Web Single Sign-On Metadata Exchange Protocol 
• Security Assertion Markup Language (SAML) (used for the exchange of 
  authentication and authorisation information) 
• XACML (may be used to describe authorization policies) 


Reliable Messaging Specifications 


• WS-ReliableMessaging 
• WS-Reliability 


Transaction Specifications 


• WS-Coordination 
• WS-Transaction 
• WS-AtomicTransaction 
• WS-BusinessActivity 


Publish-subscribe Messaging Specifications 


• WS-Notification defines a standard Web services approach to notification using 
  a topic-based publish/subscribe pattern. It is a family of documents including 
  three normative specifications: 

    o WS-BaseNotification 
    o WS-BrokeredNotification 
    o WS-Topics 

• WS-Eventing 


Basic XML Specifications 


• XML (eXtensible Markup Language) 
• Namespaces in XML 
• XML Schema 
• XPath 
• XML Information Set 
• JAXB (a Java specification; the Java-XML Binding API) 
• XInclude 
• XML Pointer 


Management Specifications 


• WS-Management / WS-Manageability 
• WS-Management Catalog 
• WS-ResourceTransfer 


Draft Specifications 


• WS-CAF: Web Services Composite Application Framework 
• WS-CDL: Web Services Choreography Description Language. It is a W3C 
  specification that describes peer-to-peer collaborations of parties by defining, 
  from a global viewpoint, their common and complementary observable 
  behavior, where ordered message exchanges result in accomplishing a common 
  business goal. 
• WSDM: Web Services Distributed Management 
• WS-Provisioning: describes the APIs and schemas necessary to facilitate 
  interoperability between provisioning systems in a consistent manner using 
  Web services 


Other 


• Devices Profile for Web Services (DPWS) 
• Web Services for Remote Portlets (WSRP) 


List of web service protocols 


The following is a list of Web service protocols. 

• BEEP - Blocks Extensible Exchange Protocol 
• BPEL - Business Process Execution Language 
• E-Business XML 
• REST (Representational State Transfer) 
• SOAP - outgrowth of XML-RPC, originally an acronym for Simple Object Access 
  Protocol 
• Universal Description, Discovery, and Integration (UDDI) 
• Web Services Description Language (WSDL) 
• WSFL - Web Services Flow Language (superseded by BPEL) 
• WSCL - Web Services Conversation Language 
• XINS Standard Calling Convention - HTTP parameters in (GET/POST/HEAD), 
  POX out 
• XLANG - XLANG-Specification (superseded by BPEL) 
• XML-RPC - XML Remote Procedure Call 


See also 


• List of Web service specifications 


FastCGI 


FastCGI is a protocol for interfacing interactive programs with a web server. FastCGI is a 
variation on the earlier Common Gateway Interface (CGI); FastCGI's main aim is to reduce the 
overhead associated with interfacing the web server and CGI programs, allowing a server to 
handle more web page requests at once. 


History 


CGI is a protocol for interfacing external applications to web servers. CGI applications run in 
a separate process, which is created at the start of each request and torn down at the end. This 
"one process per request" model makes CGI programs very simple to implement, but limits 
efficiency and scalability. At high loads, the operating system process creation and destruction 
overhead becomes significant and limits scalability. In addition, the CGI process model limits 
resource reuse techniques (such as reusing database connections, in-memory caching, etc.). 

To address the scalability shortcomings of CGI, Open Market developed FastCGI and first 
introduced it in their webserver product in the mid-1990s. Open Market originally developed 
FastCGI in part as a competitive response to Netscape's proprietary, in-process API (NSAPI) for 
developing Web applications. 


Implementation details 


Instead of creating a new process for every request, FastCGI can use a single persistent 
process which handles many requests over its lifetime. Processing of multiple requests 
simultaneously is achieved by using a single connection with internal multiplexing (i.e., 
multiple requests over a single connection), by using multiple connections, or both. Many such 
processes can exist, something that can increase stability and scalability. FastCGI also allows 
programs to get the web server to do certain simple operations, like reading in a file, before the 
request is handed over. Environment information and page requests are sent from the web server 
to the process over a TCP connection (for remote processes) or Unix domain sockets (for local 
processes). Responses are returned from the process to the web server over the same connection. 
The connection may be closed at the end of a response, but the web server and the process are 
left standing. 
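
To make the multiplexing idea concrete, the sketch below frames data the way FastCGI records are framed: each record starts with an 8-byte header carrying a version, a record type, a request ID, a content length, and a padding length, so records belonging to different requests can be interleaved on one connection and sorted out by request ID. It is an illustration only, not a working FastCGI server; the helper names `encode_record` and `decode_records` are hypothetical, only the STDIN record type is shown, and padding to a multiple of 8 bytes is a choice made for this example.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_STDIN = 5  # record type for request body data sent by the web server

# version, type, requestId (16 bit), contentLength (16 bit), paddingLength, reserved
HEADER = struct.Struct("!BBHHBB")


def encode_record(rec_type, request_id, content=b""):
    """Frame content as one record; content must be at most 65535 bytes."""
    padding = -len(content) % 8  # pad to an 8-byte boundary (a choice, not a requirement)
    header = HEADER.pack(FCGI_VERSION_1, rec_type, request_id, len(content), padding, 0)
    return header + content + b"\x00" * padding


def decode_records(data):
    """Yield (record type, request id, content) for each record in a byte string."""
    offset = 0
    while offset < len(data):
        _version, rec_type, request_id, length, padding, _reserved = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        yield rec_type, request_id, data[offset:offset + length]
        offset += length + padding


# Two requests interleaved on one connection: request 1, request 2, request 1 again.
stream = (
    encode_record(FCGI_STDIN, 1, b"first")
    + encode_record(FCGI_STDIN, 2, b"second request")
    + encode_record(FCGI_STDIN, 1, b" body")
)

# The receiving process reassembles each request's data by request ID.
by_request = {}
for rec_type, request_id, content in decode_records(stream):
    by_request[request_id] = by_request.get(request_id, b"") + content
```

A web server that does not multiplex simply opens one connection per request and uses the same framing with a single request ID per connection.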

Although initially developed by Open Market, FastCGI was implemented by a number of 
other webserver makers. The FastCGI approach, however, competed against other techniques 
which also aimed to speed and simplify server-subprogram communications, but which didn't 
follow the CGI paradigm. Apache modules such as mod_perl and mod_php appeared around the 
same time and seemed to be even better replacements for CGI, allowing closer integration with 
the core webserver. 

FastCGI languished for many years; however, later years have seen a resurgence in interest. 
Many web site administrators and programmers are finding that the separation of web 
applications from the web server in FastCGI (and the simpler SCGI) has many desirable 
advantages over embedded interpreters (mod_perl, mod_php, etc.). This separation allows server 
and application processes to be restarted independently — an important consideration for busy 
web sites. It also facilitates per-application security policies — important for ISPs and web 
hosting companies. 


Web Servers that implement FastCGI 


Note: unless stated, completeness of FastCGI implementation is unknown. 

• Abyss Web Server 
• Apache HTTP Server (partial) 
    o Implemented by either of the third-party modules mod_fastcgi (now 
      stagnant) or mod_fcgid 
    o Multiplexing of requests through a single connection is prohibited by 
      Apache's design, so this isn't supported 
• aXesW3 
• Cherokee HTTP Server 
• Hiawatha webserver 
    o Load-balancing FastCGI support 
    o Supports chrooted FastCGI servers 
• Lighttpd (partial) 
    o Multiplexing of requests through a single connection is not implemented 
• Microsoft IIS [5] 
• MyServer 
• Nginx 
• Open Market Web Server 
• Roxen webserver 
• Sun Java System Web Server (and predecessors) 
• WebSTAR 
• Zeus 


Language bindings for the FastCGI API 


FastCGI can be implemented in any language that supports sockets. APIs exist for: 
• C / C++ 
• Common Lisp: CLISP and CMUCL 
• Guile Scheme 
• Goanna Eiffel 
• Java 
• OCaml 
• Perl 
• PHP 
• Roadsend PHP 
• Python 
• Ruby 
• Tcl 
• SmallEiffel 
• Smalltalk: FasTalk and Dolphin Smalltalk 
• HP BASIC for OpenVMS 

FastCGI has enabled web application portability; in contrast, applications developed for 
embedded interpreters (such as mod_python) are often tightly bound to the Apache API. Recent 
frameworks such as Ruby on Rails, Catalyst, and Django allow use with either the embedded 
interpreters (mod_ruby, mod_perl, or mod_python respectively) or with FastCGI. 


.htaccess 


.htaccess (Hypertext Access) is the default name of Apache's directory-level configuration 
file. It provides the ability to customize configuration directives defined in the main 
configuration file. The directives used must be valid in the .htaccess context, and the user 
needs appropriate permissions. 


Error documents 


Statements such as the following can be used to configure a server to send out customized 
documents in response to client errors such as "404 Not Found" or server errors such as "503 
Service Temporarily Unavailable": 


ErrorDocument 404 /error-pages/not-found.html 
ErrorDocument 503 /error-pages/service-unavailable.html 


Directory rules 


A .htaccess file controls the directory it is in, plus all subdirectories. However, an 
additional .htaccess file placed in a subdirectory overrides these settings for that subdirectory. 
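
The override rule can be pictured with a deliberately simplified model: collect the .htaccess files from the outermost directory inward, then let deeper files win. The sketch below assumes the search starts at a hypothetical document root and that settings are plain key/value pairs; real Apache merging works per directive and is subject to AllowOverride, so this is an illustration rather than a description of Apache's implementation.

```python
from pathlib import PurePosixPath


def htaccess_candidates(document_root: str, directory: str) -> list[str]:
    """List the .htaccess paths that could apply to `directory`, outermost first."""
    root = PurePosixPath(document_root)
    # relative_to raises ValueError if `directory` lies outside the root.
    relative = PurePosixPath(directory).relative_to(root)
    paths = [str(root / ".htaccess")]
    current = root
    for part in relative.parts:
        current = current / part
        paths.append(str(current / ".htaccess"))
    return paths


def effective_settings(layers: list[dict]) -> dict:
    """Merge per-directory settings; later (deeper) layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged
```

For example, `htaccess_candidates("/var/www", "/var/www/blog/img")` lists three files, and a setting in the `img` file would replace the same setting inherited from `/var/www`.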


User permissions 


The user permissions for .htaccess are controlled at the server level with the AllowOverride 
directive, which is documented in the Apache HTTP Server documentation. 


Other uses 


Some web developers have modified .htaccess to perform custom tasks server-side before 
serving content to the browser. Developer Shaun Inman shows it is possible to edit .htaccess to 
allow for Server Side Constants within CSS. 


Hypervideo 


Hypervideo, or hyperlinked video, is a displayed video stream that contains embedded, 
user-clickable anchors,[1] allowing navigation between video and other hypermedia elements. 
Hypervideo is thus analogous to hypertext, which allows a reader to click on a word in one 
document and retrieve information from another document, or from another place in the same 
document. That is, hypervideo combines video with a non-linear information structure, allowing 
a user to make choices based on the content of the video and the user's interests. 

A crucial difference between hypervideo and hypertext is the element of time. Text is 
normally static, while a video is necessarily dynamic; the content of the video changes with time. 
Consequently, hypervideo has different technical, aesthetic, and rhetorical requirements than a 
static hypertext page. For example, hypervideo might involve the creation of a link from an 
object in a video that is visible for only a certain duration. It is therefore necessary to segment 
the video appropriately and add the metadata required to link from frames - or even objects - in a 
video to the pertinent information in other media forms. 
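
One way to picture such metadata is as a list of anchors, each valid for a time interval and a screen region. The sketch below is an assumption made for illustration (the `VideoLink` type, its fields and the example URL are hypothetical, and a fixed rectangle stands in for a tracked object); it resolves a click at a given playback time to a link target.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoLink:
    start: float  # seconds from the start of the video (inclusive)
    end: float    # seconds from the start of the video (exclusive)
    region: tuple[float, float, float, float]  # x, y, width, height in pixels
    target: str   # where the link leads


def link_at(links: list[VideoLink], t: float, x: float, y: float) -> Optional[str]:
    """Return the target of the first link active at time t under point (x, y)."""
    for link in links:
        rx, ry, rw, rh = link.region
        in_time = link.start <= t < link.end
        in_region = rx <= x < rx + rw and ry <= y < ry + rh
        if in_time and in_region:
            return link.target
    return None


links = [VideoLink(12.0, 15.5, (100, 80, 60, 40), "https://example.com/lamp")]
```

With these values, a click at (120, 100) resolves to the target at 13 seconds but to nothing at 16 seconds, because the anchor has expired by then.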


History of Hypervideo 


Illustrating the natural progression to hypervideo from hypertext, the software Storyspace,[2] a 
hypertext writing environment, employs a spatial metaphor for displaying links. Storyspace 
utilizes 'writing spaces', generic containers for content, which link to other writing spaces. 
HyperCafe,[3] a popular experimental prototype of hypervideo, made use of this tool to create 
"narrative video spaces". HyperCafe was developed as an early model of a hypervideo system, 
placing users in a virtual cafe where the user dynamically interacts with the video to follow 
different conversations. 

Video-to-video linking was demonstrated by the Interactive Cinema Group at the MIT Media 
Lab. Elastic Charles[4] was a hypermedia journal developed between 1988 and 1989, in which 
"micons" were placed inside a video, indicating links to other content. When implementing the 
Interactive Kon-Tiki Museum,[5] Liestøl used micons in order to represent video footnotes. Video 
footnotes were a deliberate extension of the literary footnote applied to annotating video, thereby 
providing continuity between traditional text and early hypervideo.[6] In 1993, Hirata et al.[7] 
considered media-based navigation for hypermedia systems, where the query uses the same type 
of media as the media to be retrieved. For example, a part of an image (defined by 
shape, or color, for example) could link to a related image. In this approach, the content of the 
video becomes the basis of forming the links to other related content. 

HotVideo was an implementation of this kind of hypervideo, developed at IBM's China 
Research Laboratory in 1996.[8] Navigation to associated resources was accomplished by 
clicking on a dynamic object in a video. In 1997, a student project at the MIT Media Lab called 
HyperSoap[9] further developed this concept. HyperSoap was a short soap opera program in 
which a viewer could click with an enhanced remote control on objects in the video to find 
information on how they could be purchased. The company Watchpoint Media was formed in 
order to commercialize the technology involved, resulting in a product called Storyteller, oriented 
towards interactive television.[10] Watchpoint Media was acquired by Goldpocket in 2003, 
which was in turn acquired by Tandberg Television in late 2005.[11] 

In 1997, the Israeli software firm Ephyx Technologies released a product called v-active,[12] 
the first commercial object-based authoring system for hypervideo. This technology was not a 
success, however; Ephyx changed its name to Veon in 1999, at which time it shifted focus away 
from hypervideo to the provision of development tools for web and broadband content.[13] 

VideoClix, a hypervideo authoring tool able to dynamically track and link objects, was 
released in 2001 by eline Technologies, founded in 1999 as a provider of hypervideo 
solutions.[14] With the advantage that its videos can play back in popular video players such as 
QuickTime and Flash, this product has proven to be a commercial success. In 2006, eline 
Technologies was acquired by VideoClix Inc. 


Concepts and Technical Challenges 


Hypervideo is challenging, compared to hyperlinked text, due to the unique difficulty video 
presents in node segmentation; that is, separating a video into algorithmically identifiable, 
linkable content. 

Video, at its most basic, is a time sequence of images, which are in turn simply 
two-dimensional arrays of color information. In order to segment a video into meaningful pieces 
(objects in images, or scenes within videos), it is necessary to provide a context, both in space 
and time, to extract meaningful elements from this image sequence. Humans are naturally able to 
perform this task; however, developing a method to achieve this automatically (or by algorithm) 
is a complex problem. 

And it is desirable to do this algorithmically. At an NTSC frame rate of 30 frames per 
second,[15] even a short video of 30 seconds comprises 900 frames. The identification of distinct 
video elements would be a tedious task if human intervention were required for every frame. 
Clearly, even for moderate amounts of video material, manual segmentation is unrealistic. 

From the standpoint of time, the smallest unit of a video is the frame (the finest time 
granularity). Node segmentation could be performed at the frame level - a straightforward task 
as a frame is easily identifiable. However, a single frame cannot contain video information, since 
videos are necessarily dynamic. Analogously, a single word separated from a text does not 
convey meaning. Thus it is necessary to consider the scene, which is the next level of temporal 
organization. A scene can be defined as the minimum sequential set of frames that conveys 
meaning. This is an important concept for hypervideo, as one might wish a hypervideo link to be 
active throughout one scene, though not in the next. Scene granularity is therefore natural in the 
creation of hypervideo. Consequently, hypervideo requires algorithms capable of detecting scene 
transitions. 
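
The text does not say which scene-detection algorithm is meant. As one commonly used possibility, the sketch below flags a cut wherever the intensity histograms of two consecutive frames differ sharply. It assumes each frame is a flat, non-empty list of grayscale values from 0 to 255, that all frames have the same size, and that the bin count and threshold are arbitrary illustrative choices.

```python
def histogram(frame: list[int], bins: int = 8) -> list[int]:
    """Count how many pixel values (0-255) fall into each of `bins` equal ranges."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    return counts


def scene_cuts(frames: list[list[int]], threshold: float = 0.5, bins: int = 8) -> list[int]:
    """Return the indices of frames that appear to start a new scene."""
    cuts = []
    for i in range(1, len(frames)):
        previous = histogram(frames[i - 1], bins)
        current = histogram(frames[i], bins)
        # Half the L1 distance divided by the pixel count:
        # 0.0 means identical histograms, 1.0 means no overlap at all.
        difference = sum(abs(a - b) for a, b in zip(previous, current)) / (2 * len(frames[i]))
        if difference > threshold:
            cuts.append(i)
    return cuts
```

A hard cut between a dark shot and a bright shot yields a difference near 1.0, while gradual transitions such as fades would need a more elaborate method than this one.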

Of course, one can imagine coarser levels of temporal organization. Scenes can be grouped 
together to form a narrative sequence, which in turn are grouped to form a video; from the point 
of view of node segmentation, these concepts are not as critical. Issues of time in hypervideo 
were considered extensively in the creation of the HyperCafe. 

Even if the frame is the smallest time unit, one can still spatially segment a video at a sub- 
frame level, separating the frame image into its constituent objects; this is necessary when 
performing node segmentation at the object level. Time introduces complexity in this case also, 
for even after an object is differentiated in one frame, it is usually necessary to follow the same 
object through a sequence of frames. This process, known as object tracking, is essential to the 
creation of links from objects in videos. Spatial segmentation of objects can be achieved, for 
example, through the use of intensity gradients to detect edges, color histograms to match 
regions, motion detection, or a combination of these and other methods. 

Once the required nodes have been segmented and combined with the associated linking 
information, this metadata must be incorporated with the original video for playback. The 
metadata is placed conceptually in layers, or tracks, on top of the video; this layered structure is 
then presented to the user for viewing and interaction. Thus the display technology, the 
hypervideo player, should not be neglected when creating hypervideo content. For example, 
efficiency can be gained by storing the geometry of areas associated with tracked objects only in 
certain keyframes, and allowing the player to interpolate between these keyframes, as developed 
for HotVideo by IBM. Furthermore, the creators of VideoClix emphasize the fact that its 
content plays back on standard players, such as QuickTime and Flash. When one considers that 
the Flash player alone is installed on over 98% of internet-enabled desktops in mature markets, 
this is perhaps a reason for the success of this product in the current arena. 
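
The keyframe idea can be sketched as follows. The source only says that the player interpolates between keyframes; linear interpolation of a bounding rectangle is an assumption made here for illustration, and the function name and data layout are hypothetical.

```python
from bisect import bisect_right
from typing import Optional

Region = tuple[float, float, float, float]  # x, y, width, height


def interpolate_region(keyframes: list[tuple[float, Region]], t: float) -> Optional[Region]:
    """Estimate an object's region at time t from keyframes sorted by time.

    Returns None when t lies outside the span covered by the keyframes.
    """
    if not keyframes:
        return None
    times = [time for time, _ in keyframes]
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(keyframes):
        return keyframes[-1][1]  # t is exactly the last keyframe
    t0, r0 = keyframes[i - 1]
    t1, r1 = keyframes[i]
    fraction = (t - t0) / (t1 - t0)
    return tuple(a + (b - a) * fraction for a, b in zip(r0, r1))


keyframes = [(0.0, (0, 0, 10, 10)), (2.0, (20, 10, 10, 10))]
```

Here only two rectangles are stored for a two-second span, and the region at any intermediate instant is computed on demand, which is the storage saving the paragraph describes.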


Hypervideo authoring tools 


The process of creating hypervideo content is known as authoring. Many early attempts at 
creating widely distributed authoring tools were not successful, for a variety of reasons. 
However, this field is currently enjoying a resurgence of interest, perhaps due to the greater 
availability of broadband internet. 

Probably the most successful product in this category is VideoClix, described on its 
website as the premier and only commercially available technology for creating clickable videos. 
It is prominent in the rapidly growing domain of internet video. Tandberg Television, 
specializing in interactive television solutions, has a hypervideo system called AdPoint for 
video-on-demand. They also sell Storyteller, a product derived from the MIT project HyperSoap. 

Adivi (Add Digital Information to VIdeo) is a project of the Darmstadt University of 
Technology, Germany. They are studying the potential of hypervideo to support collaborative 
documentation. Siemens, an engineering firm, will use this technology for enhanced on-line 
training manuals. 

Adobe Flash, a popular multimedia authoring program widely used to create animated web 
content, can also be used to create hypervideo content. As Flash was not designed as a 
hypervideo authoring tool, creating such content can be difficult using Flash alone. Such added 
functionality has been provided through outside software in the past - for example, MoVideo and 
Digital Lava. However, these products are no longer sold. 

In the past, there have been a number of attempts to market hypervideo authoring software 
that is no longer available. MediaLoom, a product based on a Master of Science project at the 
Georgia Institute of Technology, was an early hypervideo authoring tool. It used the Storyspace 
hypertext authoring environment to generate script files for the hypervideo engine of the 
HyperCafe. This product reached prototype stage, but was not commercially successful. Ephyx 
Technologies created v-active, the first authoring software using dynamically tracked objects in 
video. The company moved away from hypervideo, however, when it became Veon in 1999. 

Hypervideo can also be created using services provided by firms with proprietary methods, 
such as those provided by Vimation. However, this company does not license its authoring 
software. 


The rise of hypervideo 


As the first steps in hypervideo were taken in the late 1980s, it would appear that hypervideo 
is taking unexpectedly long to realize its potential. Many interesting experiments (HyperCafe, 
HyperSoap) have not been extensively followed up on, and authoring tools are at the moment 
available from only a small number of providers. 

However, perhaps with the wider availability of broadband internet, this situation is rapidly 
changing. Interest in hypervideo is increasing, as reflected in popular blogs on the subject, as 
well as the extraordinary rise of the internet phenomenon YouTube. Furthermore, by 2010, some 
estimates have internet downloads claiming over one third of the market for on-demand video. 

As the amount of video content increases and becomes available on the internet, the 
possibilities for linking video increase even faster. Digital libraries are constantly growing, of 
which video is an important part. News outlets have amassed vast video archives, which could be 
useful in education and historical research. Direct searching of pictures or videos, a much 
harder task than indexing and searching text, could be greatly facilitated by hypervideo methods. 

Perhaps the most significant consequence of hypervideo will result from commercial 
advertising. Devising a business model to monetize video has proven notoriously difficult. The 
application of traditional advertising methods - for example introducing ads into video - is likely 
to be rejected by the online community, while revenue from selling advertising on video sharing 
sites has so far not been promising. Hypervideo offers an alternative way to monetize video, 
allowing for the possibility of creating video clips where objects link to advertising or e- 
commerce sites, or provide more information about particular products. This new model of 
advertising is less intrusive, only displaying advertising information when the user makes the 
choice by clicking on an object in a video. And since it is the user who has requested the product 
information, this type of advertising is better targeted and likely to be more effective. 


Mirror 


A mirror in computing is a direct copy of a data set. On the Internet, a mirror site is an 
exact copy of another Internet site. Mirror sites are most commonly used to provide multiple 
sources of the same information, and are of particular value as a way of providing reliable access 
to large downloads. Mirroring is a type of file synchronization. 
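
Because a mirror is meant to be an exact copy, one simple check is to compare a checksum of every file on both sides. The sketch below assumes each site can be summarised as a mapping from path to file contents; the function names are hypothetical, and SHA-256 is just one reasonable choice of digest.

```python
import hashlib


def manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Map each path to the SHA-256 digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def mirror_differences(original: dict[str, str], mirror: dict[str, str]) -> dict[str, list[str]]:
    """Compare two manifests and report where the mirror is not an exact copy."""
    return {
        "missing": sorted(p for p in original if p not in mirror),
        "extra": sorted(p for p in mirror if p not in original),
        "changed": sorted(p for p in original if p in mirror and original[p] != mirror[p]),
    }
```

Only the manifests need to be exchanged between sites, so a large download does not have to be transferred again just to confirm that the copy is still identical.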

Mirroring can occur locally or remotely. In a local mirror, a server has a second hard drive, 
called a mirrored drive, that stores a copy of the data. When a write request is issued, data is 
written to the original drive and then copied to the mirrored drive, providing a mirror image of 
the primary drive. If one of the hard drives fails, the data survives on the other. Disk mirroring 
of this kind is one level of the RAID standard (RAID 1). In a remote mirror, a remote server 
contains an exact duplicate of the data. 

A live mirror is automatically updated as soon as the original is changed. 


Reasons 


Mirroring of sites occurs for a variety of reasons: 


• To protect data from failure, usually in hardware. 

• To preserve a website or page, especially when it is closed or is about to be 
closed. 

• To allow faster downloads for users at a specific geographical location. For 
example, a U.S. server could be mirrored in Japan, allowing Japanese Internet 
users to download content faster from the local Japanese server than from the 
original American one. This may be viewed as caching on a worldwide scale. 

• To counteract censorship and promote freedom of information. For example, an 
activist might post pictures on a website of a company conducting illegal 
activities, or make available information on secret government activity, and be 
sued for doing so. Other internet users will make the content in question 
available on other servers when the legal action results in the cancellation of ISP 
or DNS services for the original activist. 

• To provide access to otherwise unavailable information. For example, when the 
popular Google search engine was banned in 2002 by the People's Republic of 
China, the mirror elgooG was used as a way of effectively circumventing the ban. 

• To preserve historic content. Financial constraints and/or bandwidth may prevent 
the maintainers of a server from keeping older and unsupported content 
available to users who may still want it - a mirror may be made to prevent 
this content from disappearing. 

• To balance load. If one server is extremely popular, a mirror may help relieve 
this load: for example, if a Linux distribution is released as an ISO image onto the 
distribution developer's own server, this server may become overloaded with 
demand. Alternative download points allow the total number of download 
requests to be spread among several servers, maintaining the availability of the 
distribution. Metalink is frequently used for automatic load balancing by listing 
all mirrors. 

• As a temporary measure to counterbalance a sudden, temporary increase in 
traffic. For example, Slashdotted websites will often be mirrored by a few 
Slashdot posters until the article is pushed off the front page. 

• To increase a site's ranking in a search engine by placing hyperlinks from each 
mirror to every other mirror (a technique known as link farming). This is 
viewed as unethical by most search engine administrators and websurfers. 

• Rarely, as a form of plagiarism; this is, however, usually pointless, as a website 
popular enough to be worth plagiarizing will quickly discover the copy as soon 
as one of its many readers stumbles onto the plagiarized site. 

• As a form of raising advertising revenue. Wikipedia is probably the best 
example of material released under the GNU Free Documentation License which 
is then duplicated by other companies which, unlike Wikipedia, then attempt to 
generate money from advertising, etc. An example of this is the television 
article, which is mirrored at 

http://omniknow.com/common/wiki.php?in=en&term=Television 
http://www.startlearningnow.com/TV.htm 
http://www.yourart.com/research/encyclopedia.cgi?subject=/television 
http://www.internet-encyclopedia.org/wiki.php?title=Television 
http://www.everybase.com/Television 
http://www.nationmaster.com/encyclopedia/Television 

• To serve as a method of circumventing firewalls. 


Examples 


A good example of mirroring is the well-known SourceForge.net website. The basis of the 
SourceForge concept is, primarily, the hosting of open-source software projects, but secondarily 
the use of many different locations to achieve one goal: to maintain download availability to the 
user. Many innovative computer projects host their sites and software on SourceForge, which 
provides mirrors in several states and countries, from Dublin, Ireland to Tokyo, Japan. 

Examples of even larger mirrored networks include those of the Debian and FreeBSD 
software projects. The encyclopedia Wikipedia is mirrored at numerous locations. 


Programs 


There are numerous offline browsers that provide automated mirroring of entire sites. Some 
are oriented towards personal use, which allows browsing from a local copy — this means an 
initial waiting time but much improved load time for those pages once they're mirrored. 

Other programs are intended to be used by public mirror maintainers. 

• rsync 
• CVSup 


Website monitoring 


Website monitoring is the process of testing or tracking (monitoring) how end-users interact 
with a website or web application. Website monitoring is often used by businesses to ensure that 
their customers are able to access their online applications and perform actions such as 
searching, online shopping, checking an account balance, or simply researching. 


What for? 


By monitoring website and web application performance, IT departments and network 
engineers hope to avoid downtime and keep their applications running. Commonly measured 
metrics are response time and availability (or uptime), although both consistency and reliability 
metrics are starting to gain popularity. 


Inside or outside monitoring 


Website monitoring can be done from both inside and outside of a corporate firewall. 
Traditional network management solutions focus on inside-the-firewall monitoring, whereas 
external performance monitoring will test and monitor performance issues across the Internet 
backbone and in some cases all the way to the end-user. External performance monitoring is also 
known as end-user monitoring or end-to-end performance monitoring. 

Real user monitoring measures the performance and availability experienced by actual users, 
diagnoses individual incidents, and tracks the impact of a change. 


What impact? 


Availability (uptime)   Also known as   Downtime per year 

90%                     n/a             876 hours 
95%                     n/a             438 hours 
99%                     two 9's         87 hours, 36 minutes 
99.9%                   three 9's       8 hours, 45 minutes, 36 seconds 
99.99%                  four 9's        52 minutes, 33.6 seconds 
99.999%                 five 9's        5 minutes, 15.36 seconds 
99.9999%                six 9's         31.536 seconds 
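
The downtime figures follow from simple arithmetic: the unavailable fraction multiplied by the length of a year. The sketch below assumes a 365-day year and takes the percentage as a string so that decimal arithmetic stays exact; the function name is hypothetical.

```python
from decimal import Decimal


def downtime_seconds(availability_percent: str, days_per_year: int = 365) -> Decimal:
    """Return the yearly downtime in seconds implied by an availability figure."""
    unavailable = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable * days_per_year * 24 * 3600
```

For instance, `downtime_seconds("99.99")` gives 3153.6 seconds, which is the 52 minutes and 33.6 seconds listed for four nines.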


Various types of protocol 


A website monitoring service can check HTTP pages, HTTPS, FTP, SMTP, POP3, IMAP, 
DNS, SSH, Telnet, SSL, TCP, PING and a range of other ports, at check intervals ranging 
from every 4 hours to every minute. Typically, most website monitoring services 
test a server anywhere between once per hour and once per minute. 

Response time: for example, an HTTP page should answer in less than 1 second (for a 
download of 16 KB) to be considered "good". 


Various types of monitoring 


You may monitor a single page of your website, but you can also monitor a complete 
business process (often referred to as multi-step transactions). 
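
A multi-step transaction check can be sketched as a list of named steps that are run in order and timed individually. In the sketch below each step is a hypothetical callable returning True on success; a real monitor would issue HTTP requests inside those callables, and the one-second default is only a placeholder threshold.

```python
import time
from typing import Callable, Sequence


def run_transaction(
    steps: Sequence[tuple[str, Callable[[], bool]]],
    max_seconds: float = 1.0,
) -> list[dict]:
    """Run the steps in order, timing each one; stop at the first failure."""
    results = []
    for name, action in steps:
        started = time.perf_counter()
        try:
            ok = bool(action())
        except Exception:
            ok = False  # an exception counts as a failed step
        elapsed = time.perf_counter() - started
        results.append(
            {"step": name, "ok": ok, "seconds": elapsed, "slow": elapsed > max_seconds}
        )
        if not ok:
            break  # later steps depend on the earlier ones
    return results
```

Stopping at the first failed step reflects the idea that a business process such as search, add to cart and checkout is only as available as its weakest step.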


Servers around the globe 


Website monitoring services usually have a number of servers around the globe - USA, 
Europe, Asia, Australia and other locations. By having multiple servers in different geographic 
locations, a monitoring service can determine whether a Web server is available across different 
networks worldwide. The more locations, the better the picture of a website's availability. 


Types of website monitoring 


There are two main types of website monitoring: 
• Synthetic monitoring, also known as active monitoring, and 
• Passive monitoring, also known as real monitoring. 


Notification options - alerts 


As the information brought by website monitoring services is in most cases urgent and may 
be of crucial importance, various notification methods, often known as "alerts", may be used: 
e-mail, regular and cell phones, SMS, fax, pagers, etc. 


See also 


• Web analytics 


Website management 


A website (alternatively, Web site or web site) is a collection of Web pages, images, videos 
and other digital assets that is hosted on one or more Web servers, usually accessible via the 
Internet, a cell phone or a LAN. 

A Web page is a document, typically written in HTML, that is almost always accessible via 
HTTP, a protocol that transfers information from the Web server to display in the user's Web 
browser. 

All publicly accessible websites are seen collectively as constituting the "World Wide Web". 

The pages of websites can usually be accessed from a common root URL called the 
homepage, and usually reside on the same physical server. The URLs of the pages organize them 
into a hierarchy, although the hyperlinks between them control how the reader perceives the 
overall structure and how the traffic flows between the different parts of the sites. 

Some websites require a subscription to access some or all of their content. Examples of 
subscription sites include many business sites, parts of many news sites, academic journal sites, 
gaming sites, message boards, Web-based e-mail services, social networking websites, and sites 
providing real-time stock market data. 

As of March 2007 there are over 8 billion web pages in total on the World Wide Web. 


History 


The first on-line website appeared in 1991. On 30 April 1993, CERN announced that the 
World Wide Web would be free to anyone.[1] A copy of the original first Web page, created by 
Tim Berners-Lee, has been preserved online. 


Overview 


Organized by function, a website may be 
• a personal website 
• a business website 
• a government website 
• a non-government website or 
• a non-profit organization website or blog 

It could be the work of an individual, a business or other organization and is typically 
dedicated to some particular topic or purpose. Any website can contain a hyperlink to any other 
website, so the distinction between individual sites, as perceived by the user, may sometimes be 
blurred. 

Websites are written in, or dynamically converted to, HTML (Hyper Text Markup Language) 
and are accessed using a software program called a Web browser, also known as an HTTP client. 
Web pages can be viewed or otherwise accessed from a range of computer-based and Internet- 
enabled devices of various sizes, including desktop computers, laptop computers, PDAs and cell 
phones. 

A website is hosted on a computer system known as a web server, also called an HTTP 
server, and these terms can also refer to the software that runs on these systems and that retrieves 
and delivers the Web pages in response to requests from the website users. Apache is the most 
commonly used Web server software (according to Netcraft statistics) and Microsoft's Internet 
Information Server (IIS) is also commonly used. 

A static website is one that has web pages stored on the server in the same form as the user 
will view them. They are edited using three broad categories of software: 

• Text editors, such as Notepad or TextEdit, where the HTML is manipulated 
directly within the editor program 

• WYSIWYG editors, such as Microsoft FrontPage and Macromedia Dreamweaver, 
where the site is edited using a GUI and the underlying HTML is 
generated automatically by the editor software 

• Template-based editors, such as Rapidweaver and iWeb, which allow users to 
quickly create and upload websites to a web server without having to know 
anything about HTML, as they just pick a suitable template from a palette and 
add pictures and text to it in a DTP-like fashion without ever having to see any 
HTML code. 

A dynamic website is one that has frequently changing information or collates information 
on the fly each time a page is requested. For example, it would call various bits of information 
from a database and put them together in a pre-defined format to present the reader with a 
coherent page. It interacts with users in a variety of ways, including by reading cookies to 
recognize users' previous history, by using session variables, server-side variables etc., or 
through direct interaction (form elements, mouseovers, etc.). A site can display the current state 
of a dialogue between users, monitor a changing situation, or provide information in some way 
personalized to the requirements of the individual user. 
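
The database example can be made concrete with a small sketch. It assumes a hypothetical `products` table in SQLite and a fixed HTML template; the table, columns and function name are invented for illustration and do not correspond to any particular system named in the text.

```python
import sqlite3
from html import escape
from string import Template

# The pre-defined format into which the database rows are placed.
PAGE = Template("<html><body><h1>$title</h1><ul>$items</ul></body></html>")


def render_page(conn: sqlite3.Connection, category: str) -> str:
    """Build a page for one category from whatever the database holds right now."""
    rows = conn.execute(
        "SELECT name, price FROM products WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()
    # Escape text taken from the database before placing it in HTML.
    items = "".join(f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows)
    return PAGE.substitute(title=escape(category), items=items)
```

Because the query runs on every call, a change to the table shows up the next time the page is requested, without anyone editing an HTML file.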

There is a wide range of software systems, such as Java Server Pages (JSP), the PHP and Perl 
programming languages, Active Server Pages (ASP) and ColdFusion (CFM) that are available to 
generate dynamic Web systems and dynamic sites. Sites may also include content that is 
retrieved from one or more databases or by using XML-based technologies such as RSS. 

Static content may also be dynamically generated either periodically, or if certain conditions 
for regeneration occur (cached) in order to avoid the performance loss of initiating the dynamic 
engine on a per-user or per-connection basis. 
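
A minimal sketch of this caching idea follows: the page is rendered once and reused until it is too old or is explicitly invalidated. The class name, the time-based expiry rule and the injectable clock are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


class CachedPage:
    """Keep the rendered output and regenerate it only when a condition is met."""

    def __init__(
        self,
        render: Callable[[], str],
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render
        self._max_age = max_age_seconds
        self._clock = clock
        self._html: Optional[str] = None
        self._rendered_at = 0.0

    def get(self) -> str:
        """Return the cached page, running the dynamic engine only if needed."""
        now = self._clock()
        if self._html is None or now - self._rendered_at >= self._max_age:
            self._html = self._render()
            self._rendered_at = now
        return self._html

    def invalidate(self) -> None:
        """Force regeneration on the next request, e.g. after the data changes."""
        self._html = None
```

Every request between regenerations is served from the stored string, so the cost of the dynamic engine is paid once per interval rather than once per user.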

Plugins are available to expand the features and abilities of Web browsers, which use them to 
show active content, such as Flash, Shockwave or applets written in Java. Dynamic HTML also 
provides for user interactivity and realtime element updating within Web pages (i.e., pages don't 
have to be loaded or reloaded to effect any changes), mainly using the DOM and JavaScript, 
support for which is built into most modern Web browsers. 


Websites as businesses 


Turning a website into an income source is a common practice for web developers and 
website owners. There are several methods for creating a website business which fall into two 
broad categories. 

1. Online Information Businesses 

Some websites offer no products at all but provide free information, with income coming 
from clicks the visitors make on advertisements (see contextual ads). A wide range of 
monetization methods is used on such sites, and the sites themselves are actively bought and 
sold as going concerns. 

Guides have been published which explain how to create such a business. See links at bottom 
of page. 

2. Online Shop Businesses 

While most business websites serve as a shop window for brick and mortar businesses it is 
increasingly the case that some websites are businesses in their own right. These websites are 
fully self-contained business entities offering, for example, immediate downloads of retail 
software on payment of the product's price via their shopping cart. 

Guides have been published which explain how to create such a business. See links at bottom 
of page. 


Spelling 


As noted above, there are several different spellings for this term. Although "website" and 
"web site" are commonly used (the former especially in British English), the Associated Press 
Stylebook, Reuters, Microsoft, academia, book publishing, The Chicago Manual of Style, and 
dictionaries such as Merriam-Webster use the two-word, initially capitalized spelling Web site. 
This is because "Web" is not a general term but a shortened form of World Wide Web. As with 
many newly created terms, it may take some time before a common spelling is finalized. (This 
controversy also applies to derivative terms such as "Web master"/"webmaster" and "Web 
cam"/"webcam"). 

The Canadian Oxford Dictionary and the Canadian Press Stylebook list "website" and "web 
page" as the preferred spellings. The Oxford English Dictionary began using "website" as its 
standardized form in 2004. 

Bill Walsh, the copy chief of The Washington Post's national desk, and one of American 
English’s foremost grammarians, argues for the two-word spelling with capital W in his books 
Lapsing into a Comma and The Elephants of Style, and on his site, The Slot. 


Way to Upload website and earn on click basis 


Here are some guidelines for making a website for earning purposes. 


Types of websites 


There are many varieties of Web sites, each specializing in a particular type of content or use, 
and they may be arbitrarily classified in any number of ways. A few such classifications might 
include: 

Affiliate: enabled portal that renders not only its custom CMS but also 
syndicated content from other content providers for an agreed fee. There are 
usually three relationship tiers: affiliate agencies (e.g., Commission Junction), 
advertisers (e.g., eBay) and consumers (e.g., Yahoo). 

Archive site: used to preserve valuable electronic content threatened with 
extinction. Two examples are: Internet Archive, which since 1996 has preserved 
billions of old (and new) Web pages; and Google Groups, which in early 2005 
was archiving over 845,000,000 messages posted to Usenet news/discussion 
groups. 

Blog (or web log) site: sites generally used to post online diaries which may 
include discussion forums (e.g., blogger, Xanga). 

Corporate website: used to provide background information about a business, 
organization, or service. 

Commerce site or eCommerce site: for purchasing goods, such as Amazon.com. 

Community site: a site where persons with similar interests communicate with 
each other, usually by chat or message boards, such as MySpace. 

Database site: a site whose main use is the search and display of a specific 
database's content such as the Internet Movie Database or the Political 
Graveyard. 

Development site: a site whose purpose is to provide information and 
resources related to software development, Web design and the like. 

Directory site: a site that contains varied contents which are divided into 
categories and subcategories, such as Yahoo! directory, Google directory and 
Open Directory Project. 

Download site: strictly used for downloading electronic content, such as 
software, game demos or computer wallpaper. 

Employment site: allows employers to post job requirements for a position or 
positions and prospective employees to fill an application. 

Erotica websites: shows sexual videos and images. 

Game site: a site that is itself a game or "playground" where many people come 
to play, such as MSN Games, Pogo.com and Newgrounds.com. 

Geodomain refers to domain names that are the same as those of geographic 
entities, such as cities and countries. For example, Richmond.com is the 
geodomain for Richmond, Virginia. 

Gripe site: a site devoted to the critique of a person, place, corporation, 
government, or institution. 

Humor site: satirizes, parodies or otherwise exists solely to amuse. 

Information site: contains content that is intended to inform visitors, but not 
necessarily for commercial purposes, such as: RateMyProfessors.com, Free 
Internet Lexicon and Encyclopedia. Most government, educational and non-profit 
institutions have an informational site. 

Java applet site: contains software to run over the Web as a Web application. 

Mirror (computing) site: a complete reproduction of a website. 

News site: similar to an information site, but dedicated to dispensing news and 
commentary. 

Personal homepage: run by an individual or a small group (such as a family) 
that contains information or any content that the individual wishes to include. 

Phish site: a website created to fraudulently acquire sensitive information, such 
as passwords and credit card details, by masquerading as a trustworthy person 
or business (such as Social Security Administration, PayPal) in an electronic 
communication (see Phishing). 

Political site: A site on which people may voice political views. 

Pornography (porn) site: a site that shows pornographic images and videos. 

Rating site: a site on which people can praise or disparage what is featured (e.g. 
ratemyrack.com). 

Review site: a site on which people can post reviews for products or services. 

Search engine site: a site that provides general information and is intended as a 
gateway or lookup for other sites. A pure example is Google, and the most 
widely known extended type is Yahoo!. 

Shock site: includes images or other material that is intended to be offensive to 
most viewers (e.g. rotten.com). 

Warez: a site filled with illegal downloads. 

Web portal: a site that provides a starting point or a gateway to other resources 
on the Internet or an intranet. 

Wiki site: a site which users collaboratively edit (such as Wikipedia). 

Some websites may be included in one or more of these categories. For example, a business 
website may promote the business's products, but may also host informative documents, such as 
white papers. There are also numerous sub-categories to the ones listed above. For example, a 
porn site is a specific type of eCommerce site or business site (that is, it is trying to sell 
memberships for access to its site). A fan site may be a dedication from the owner to a particular 
celebrity. 

Websites are constrained by architectural limits (e.g., the computing power dedicated to the 
website). Very large websites, such as Yahoo!, Microsoft, and Google employ many servers and 
load balancing equipment such as Cisco Content Services Switches to distribute visitor loads 
over multiple computers at multiple locations. 

In January 2007, Netcraft, an Internet monitoring company that has tracked Web growth 
since 1995, reported that there were 106,875,138 Web sites with domain names and content on 
them, compared to just 18,000 Web sites in August 1995. 


Prizes 


The Webby Awards are a set of awards presented to the world's "best" websites, a concept 
pioneered by Best of the Web in 1994. 


See also 


List of content management systems 
Macromedia Dreamweaver 
Microsoft FrontPage 

Web design 

Web development 

Web analytics 

Web application 

Web hosting 

Website monitoring 

Website architecture 

Web service 

Web template 

World Wide Web Consortium (Web standards) 


Web hosting service | Web producer | Web server | Webmaster 


Web hosting service 


Types of Internet hosting service 


A web hosting service is a type of Internet hosting service that allows individuals and 
organizations to make their own websites accessible via the World Wide Web. Web hosts are 
companies that provide space on a server they own for use by their clients, as well as 
Internet connectivity, typically in a data center. Web hosts can also provide data center space 
and Internet connectivity for servers they do not own that are located in their data center; 
this is called colocation. 


Full-featured hosting services 
o Virtual private server 
o Dedicated hosting 
o Colocation centre 


Web hosting 
o Free hosting 
o Shared hosting 
o Clustered hosting 
o Reseller hosting 
o Application-specific: blog hosting, guild hosting, image hosting, video hosting, wiki farms 


File hosting 
Remote backup service 
Game server hosting 
DNS hosting 
E-mail hosting 


Service scope 


The scope of hosting services varies widely. The most basic is webpage and small-scale file 
hosting, where files can be uploaded via File Transfer Protocol (FTP) or a Web interface. The 
files are usually delivered to the Web "as is" or with little processing. Many Internet service 
providers (ISPs) offer this service for free to their subscribers. People can also obtain Web page 
hosting from other, alternative service providers. Personal web site hosting is typically free, 
advertisement-sponsored, or cheap. Business web site hosting is often more expensive. 

Single page hosting is generally sufficient only for personal web pages. A complex site calls 
for a more comprehensive package that provides database support and application development 
platforms (e.g. PHP, Java, and ASP.NET). These facilities allow the customers to write or install 
scripts for applications like forums and content management. For e-commerce, SSL is also 
required. 

The host may also provide an interface control panel (e.g. cPanel, Plesk or others) for 
managing the Web server and installing scripts as well as other services like e-mail. 
Web.com holds many patents that it claims cover broad methods for website building and web 
control panels. Hostopia, a large wholesale host, recently purchased a license to use that 
technology from Web.com for 10% of retail revenues [1]. Web.com has also sued Go Daddy for 
similar patent infringement. 

Some hosts specialize in certain software or services (e.g. e-commerce). They are commonly 
used by larger companies to outsource network infrastructure to a hosting company. To find a 
web hosting company, there are searchable directories that can be used. One must be extremely 
careful when searching for a new company due to the fact that many of the people promoting 
service providers are actually affiliates and the reviews are biased. 


Hosting reliability and uptime 


Hosting uptime refers to the percentage of time the host is accessible via the internet. Many 
providers state that they aim for a 99.9% uptime, but there may be server restarts and planned (or 
unplanned) maintenance in any hosting environment. 

A common claim from popular hosting providers is "99% or 99.9% server uptime", but this 
often refers only to a server being powered on and doesn't account for network downtime. Real 
downtime can potentially be larger than the percentage guaranteed by the provider. Many 
providers tie uptime and accessibility into their own Service Level Agreement, or SLA. SLAs 
may or may not include refunds or reduced costs if performance goals are not met. Customers 
should be careful when selecting a new company and should read all terms and conditions 
closely. A potential customer should also check the web hosting company's acceptable use 
policy (AUP) in order to avoid potential cancellation of services due to activities that are 
considered a violation. 
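
To make the uptime percentages concrete, the short sketch below converts an advertised uptime 
figure into the downtime it still permits. The function name and the choice of a 365-day year are 
assumptions made for illustration; an actual SLA defines its own measurement period. 

```python
# Hours in a non-leap year; an assumption chosen for illustration.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime (in hours) that an uptime percentage still permits."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * (100 - uptime_percent) / 100


if __name__ == "__main__":
    for pct in (99.0, 99.9):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime permits about {hours:.1f} hours of downtime per year")
```

Under these assumptions, 99% uptime still permits roughly 87.6 hours of downtime per year, while 
99.9% permits under 9 hours, which is why the difference between the two figures matters. 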


Types of hosting 


Internet hosting services can run Web servers. 
Hosting services limited to the Web: 
• Free web hosting service: free, (sometimes) advertisement-supported web 
hosting, often limited when compared to paid hosting. 

• Shared web hosting service: one's Web site is placed on the same server as 
many other sites, ranging from a few to hundreds or thousands. Typically, all 
domains share a common pool of server resources, such as RAM and the 
CPU. A shared website may be hosted with a reseller. 

• Reseller web hosting: allows clients to become web hosts themselves. 
Resellers could function, for individual domains, under any combination of 
these listed types of hosting, depending on who they are affiliated with as a 
provider. Resellers' accounts may vary tremendously in size: they may have 
anything from their own virtual dedicated server to a colocated server. 

• Virtual Dedicated Server: slicing up a server into virtual servers. Each user 
feels like they're on their own dedicated server, but they're actually sharing a 
server with many other users. The users may have root access to their own 
virtual space. 

• Dedicated hosting service: the user gets his or her own Web server and gains 
full control over it (root access for Linux/administrator access for Windows); 
however, the user typically does not own the server. Another type of dedicated 
hosting is self-managed or unmanaged. This is usually the least expensive of the 
dedicated plans. The user has full administrative access to the box, which 
means the client is responsible for the security and maintenance of his own 
dedicated box. 

• Managed hosting service: the user gets his or her own Web server but is not 
allowed full control over it (root access for Linux/administrator access for 
Windows); however, they are allowed to manage their data via FTP or other 
remote management tools. The user is disallowed full control so that the 
provider can guarantee quality of service by not allowing the user to modify the 
server or potentially create configuration problems. The user typically does not 
own the server. The server is leased to the client. 

• Colocation web hosting service: similar to the dedicated web hosting service, 
but the user owns the colo server; the hosting company provides the physical space 
that the server takes up and takes care of the server. This is the most powerful 
and expensive type of web hosting service. In most cases, the colocation 
provider may provide little to no support directly for their client's machine, 
providing only the electrical power, Internet access, and storage facilities for the 
server. In most cases for colo, the client would have his own administrator visit 
the data center on site to do any hardware upgrades or changes. 

• Clustered hosting: having multiple servers hosting the same content for better 
resource utilization. 

Some specific types of hosting provided by web host service providers: 

• File hosting service: hosts not web pages but files 
• Image hosting service 
• Video hosting service 
• Blog hosting service 
• One-click hosting 
• Shopping cart software 


Obtaining hosting 


Web hosting is often provided as part of a general Internet access plan; there are many free 
and paid providers offering these services. 

A customer needs to evaluate the requirements of the application to choose what kind of 
hosting to use. Such considerations include database server software, scripting software, and 
operating system. Most hosting providers provide Linux-based web hosting which offers a wide 
range of different software. A typical configuration for a Linux server is the LAMP platform: 
Linux, Apache, MySQL, and PHP/Perl/Python. The webhosting client may want to have other 
services, such as email for their business domain, databases or multi-media services for 
streaming media. A customer may also choose Windows as the hosting platform. The customer 
can still choose from PHP, Perl, and Python but may also use ASP.NET or Classic ASP. 

Web hosting packages often include a Web Content Management System, so the end-user 
doesn't have to worry about the more technical aspects. 

One can also use Google to find active webhosting message boards that may provide 
feedback on what type of webhosting company may suit his/her needs. 


History of web hosting 


Before 1991, web hosting as you know it today did not exist. 

Then again, you probably wouldn’t recognize the internet before 1991, either. But to look at 
the history of website hosting, you have to look at the history of the internet itself. The internet 
may be the greatest media advancement since radio and television, but the internet as we know it 
today is powered by nearly 50 million websites forming its central nervous system. 

Without websites, where could you go when you went online? 

The Early Days 

The original concept of the internet has been attributed to J.C.R. Licklider in 
August of 1962 at MIT. Licklider wrote a series of articles where he envisioned a “Galactic 
Network” concept based on the idea of a series of globally interconnected computers where 
resources and information could be accessed from any site. 

Sound familiar? 

Licklider was soon to head the computer research program at ARPA (Advanced Research 
Projects Agency). There he would convince his successors of the importance of his ideas 
about computer networking. 

Interestingly enough, ARPA was developed by the military at the same time as NASA to find 
a way for the Americans to catch up to the U.S.S.R. in the space race after the launch of Sputnik. 
The early work on computer networking revolved around a concept known as “packet 
switching”, based on the idea that network data could be sent through phone lines as tiny 
packages instead of the traditional solid circuit lines of the day. 

That way, the connections would only be used as long as there were packets of information 
running through them, freeing up space on the line or “bandwidth” for more computer activity. 
Later, while working on a way to allow telecommunications systems to survive a nuclear war, Paul 
Baran would develop the actual “Hot Potato” design of networking that would lay the foundation 
for what would one day be the internet. The next step was to get the computers to actually talk to 
each other. 

Is There Anybody Out There? 

In 1965, Lawrence G. Roberts and Thomas Merrill connected 
the TX-2 computer located in Massachusetts with the Q-32 computer in California via a dial-up 
telephone line supplied by AT&T and the first computer network was born. However, the 
computers were agonizingly slow, communicating at a steady 2.4 kbps. 

The lack of speed convinced Roberts and Merrill that the solid circuit-switching of the AT&T 
phone system was terribly inadequate and Baran’s packet switching method was the only way to 
go. Amazingly, it was soon discovered that the same work on packet theory had been taking 
place in three separate places simultaneously without any of the researchers aware of each other. 

In 1966 Roberts unveiled his plans for “ARPANET”, the first wide-area network ever 
developed. In 1969, those ideas turned into a reality when they successfully linked computers at 
UCLA, The Stanford Research Institute, The University of California Santa Barbara, and the 
University of Utah. Each computer was a “host” or node in the connection, making them all able 
to interact with one another. Over the next 2 years, they would add 19 more hosts and 13 nodes 
to their little network. 

The internet was a healthy baby, but it still had a lot of growing to do. 

You’ve Got Mail! 

In 1971 Ray Tomlinson wrote the first basic e-mail program, and it was 
quickly broadened by Lawrence G. Roberts. With this, researchers could finally send and receive 
messages over their network. 

This would prove to be the biggest development in the internet’s short history; e-mail use has 
become the backbone of internet communications and is used by hundreds of millions of people 
every day to connect with each other. 

When the researchers integrated the popular program into ARPANET, they made several 
design modifications before deciding on the “@” symbol for e-mail addresses. 

The ‘70s also saw the birth of TELENET, the first commercial version of an internet 
provider, as well as several other networks. Also, TCP was officially split into TCP/IP in an 
attempt to unify all of the budding networks that were springing up in North America and around 
the globe. 

TCP/IP stands for Transmission Control Protocol and Internet Protocol. TCP provides the 
host-to-host connection used by computers, and IP passes the individual packets of information 
between computers. 

The internet was growing, but it was still a very different animal than what we know today. 

Coming of Age 

The eighties saw rapid growth and development in the computer sciences 
field. Specifically, the TCP/IP format was first used to tie the ARPANET system to several other 
networks. The format allowed the networks to access each other while operating individually. 
Officially, it was the first definition of the term “Internet”, meaning a series of networks linked 
together by the TCP/IP format. 

With all of these new networks and the growth of the old networks, it became necessary for 
scientists to be able to differentiate between the various sources and institutions. In 1984 the 
Domain Name System, or DNS, was introduced and became a standard for computers to be able 
to differentiate themselves from one another. Six domains were introduced: edu (Education), gov 
(Government), mil (Military), com (Commercial), net (Network Resources), and org 
(Organization). On March 15, 1985, Symbolics.com became the first registered domain name. 


Welcome to the World Wide Web 

1991 was an important year in the development of the 
internet. Already an entity in its own right, it was about to get a lot bigger. It started with the 
National Science Foundation (NSF) when they decided it was time to lift commercial restrictions 
on the internet. This in turn opened the internet up to limitless commercial possibilities. Electronic 
commerce was born, and with it came companies who were starting to think there might be a 
future in website hosting services. 

Later that year, the folks at CERN unleashed the World Wide Web (WWW) onto the world, 
which incorporated Tim Berners-Lee’s new HTML computer language. HTML stands for 
HyperText Markup Language, and uses specifications for Uniform Resource Locators (URLs). 

Aside from giving the world a mouthful of new abbreviations to memorize, it also became 
the universal standard for locating website addresses. 

The internet was no longer simply a playground for universities and computer enthusiasts. 
With each new addition to its format, it became easier to use and easier to explain. At the same 
time, it grew in complexity. The business world saw the potential of the medium and seized on 
their chance. 

Website hosting, once expensive and complicated, is now cheap and only somewhat 
complicated. It began with large companies renting out extra space on their servers and has now 
become big business in itself. There are at least as many companies that offer web hosting as 
there are companies that provide internet service. 

As computers continue to evolve, the internet itself evolves. And with each new change come 
new changes to the way the business of website hosting is packaged to potential customers. 


Web producer 


A web producer is a career classification that describes someone who oversees the 
production and day to day operations of a web site. 

Web Producers must be familiar with all aspects of web site creation including: project 
scoping, information architecture, web design, web development and web site analytics. 
Although they may not directly create all of these aspects of a project themselves, Web 
Producers manage communication between the team members who do, and therefore must 
understand what is required of each team member. 

Web Producers must also be able to create content for the site. Content may be copy, 
photography or video. As such, Web Producers usually come from online Journalism or web 
Content Coordinator backgrounds. 

Web Producers also are responsible for balancing user interest with business interests on 
their sites. 


Web server 


The term Web server can mean one of two things: 

1. A computer program that is responsible for accepting HTTP requests from 
clients, which are known as Web browsers, and serving them HTTP responses 
along with optional data contents, which usually are Web pages such as HTML 
documents and linked objects (images, etc.). 

2. A computer that runs a computer program which provides the functionality 
described in the first sense of the term. 


Common features 


Although Web server programs differ in detail, they all share some basic common features. 

1. HTTP: every Web server program operates by accepting HTTP requests from 
the network, and providing an HTTP response to the requester. The HTTP 
response typically consists of an HTML document, but can also be a raw text file, 
an image, or some other type of document (defined by MIME types); if 
an error is found in the client request or occurs while trying to serve the request, a 
Web server has to send an error response, which may include some custom 
HTML or text messages to better explain the problem to end users. 

2. Logging: Web servers usually also have the capability of logging detailed 
information about client requests and server responses to log files; this allows 
the Webmaster to collect statistics by running log analyzers on log files. 
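
As a rough illustration of what a log analyzer does, the sketch below counts HTTP status codes 
in a few log lines. It assumes the log uses the NCSA Common Log Format; the sample entries, the 
pattern name LOG_LINE and the function count_statuses are invented for this example, and real 
log formats vary with server configuration. 

```python
import re
from collections import Counter

# Pattern for the NCSA Common Log Format (assumed here):
# host ident authuser [date] "request" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


def count_statuses(lines):
    """Count HTTP status codes across log lines, skipping unparseable ones."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[match.group("status")] += 1
    return counts


# Invented sample entries, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Jan/2007:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Jan/2007:13:55:40 +0000] "GET /missing.html HTTP/1.1" 404 209',
    '192.0.2.1 - - [10/Jan/2007:13:56:02 +0000] "GET /logo.png HTTP/1.1" 200 5120',
]

if __name__ == "__main__":
    print(count_statuses(SAMPLE_LOG))
```

The same loop could be extended to tally requests per host or bytes served, which is the kind of 
statistic a Webmaster typically wants from log files. 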

In practice many Web servers implement the following features too. 

1. Authentication: an optional authorization request (request of user name and 
password) before allowing access to some or all kinds of resources. 

2. Handling of not only static content (file content recorded in the server's 
filesystem(s)) but of dynamic content too, by supporting one or more related 
interfaces (SSI, CGI, SCGI, FastCGI, JSP, PHP, ASP, ASP.NET, Server APIs such as 
NSAPI, ISAPI, etc.). 

3. HTTPS support (by SSL or TLS) to allow secure (encrypted) connections to the 
server on the standard port 443 instead of the usual port 80. 

4. Content compression (e.g. by gzip encoding) to reduce the size of the 
responses (to lower bandwidth usage, etc.). 

5. Virtual Hosting to serve many web sites using one IP address. 

6. Large file support to be able to serve files whose size is greater than 2 GB on 
32-bit operating systems. 

7. Bandwidth throttling to limit the speed of responses in order to not saturate 
the network and to be able to serve more clients. 
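
Content compression from the list above can be sketched as follows. This is a simplified 
illustration, not how any particular server implements it: the function name maybe_compress is 
hypothetical, and the Accept-Encoding check ignores the quality values that real servers honour. 

```python
import gzip


def maybe_compress(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers), gzip-compressing when the client allows it."""
    # Simplified check: split the header on commas and drop any ";q=..." suffix.
    encodings = [token.split(";")[0].strip().lower()
                 for token in accept_encoding.split(",")]
    if "gzip" in encodings:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    # The client did not advertise gzip support, so send the body unchanged.
    return body, {}
```

A repetitive HTML page usually shrinks considerably under gzip, which is where the bandwidth 
saving comes from; a client that does not list gzip in its Accept-Encoding header receives the 
original bytes. 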


Origin of returned content 


The content sent by a server is called: 
• static if it comes from an existing file lying on a filesystem; 
• dynamic if it is dynamically generated by some other program or script or API 
called by the Web server. 

Serving static content is usually much faster (from 2 to 100 times) than serving dynamic 
content, especially if the latter involves data pulled from a database. 


Path translation 


Web servers are able to map the path component of a Uniform Resource Locator (URL) into: 
• a local file system resource (for static requests); 
• an internal or external program name (for dynamic requests). 

For a static request the URL path specified by the client is relative to the Web server's root 
directory. 
Consider the following URL as it would be requested by a client: 


http://www.example.com/path/file.html 


The client's Web browser will translate it into a connection to www.example.com with the 
following HTTP 1.1 request: 


GET /path/file.html HTTP/1.1 
Host: www.example.com 


The Web server on www.example.com will append the given path to the path of its root 
directory. On Unix machines, this is commonly /var/www/htdocs. The result is the local file 
system resource: 


/var/www/htdocs/path/file.html 


The Web server will then read the file, if it exists, and send a response to the client's Web 
browser. The response will describe the content of the file and contain the file itself. 
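
The two steps described above (turning a URL into a request, then mapping the request path onto 
the file system) can be sketched as below. This is a minimal illustration under stated 
assumptions: the function names build_request and translate_path are hypothetical, the document 
root is the /var/www/htdocs example from the text, and query strings are ignored. 

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Document root taken from the example in the text; real servers make this configurable.
DOCUMENT_ROOT = "/var/www/htdocs"


def build_request(url: str) -> str:
    """Return a minimal HTTP/1.1 request for the URL (query string omitted)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return f"GET {path} HTTP/1.1\r\nHost: {parts.netloc}\r\n\r\n"


def translate_path(url_path: str, root: str = DOCUMENT_ROOT) -> str:
    """Map a URL path onto a file below the document root."""
    # Decode %XX escapes, then collapse "." and ".." segments.
    normalized = posixpath.normpath("/" + unquote(url_path).lstrip("/"))
    # normpath on an absolute path cannot climb above "/",
    # so the result always stays under the document root.
    return root.rstrip("/") + normalized


if __name__ == "__main__":
    print(build_request("http://www.example.com/path/file.html"))
    print(translate_path("/path/file.html"))
```

Normalizing the path before appending it matters in practice: without it, a request path 
containing ".." segments could point outside the root directory. 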


Performance 


Web servers (programs) are supposed to serve requests quickly from more than one TCP/IP 
connection at a time. 

The key performance parameters (measured under a varying load of clients and requests 
per client) are: 

• number of requests per second (depending on the type of request, etc.); 

• latency response time in milliseconds for each new connection or request; 

• throughput in bytes per second (depending on file size, cached or not cached 
content, available network bandwidth, etc.). 

The three parameters above vary noticeably depending on the number of active connections, so a 
fourth parameter is the concurrency level supported by a Web server under a specific 
configuration. 

Last but not least, the specific server model used to implement a Web server program can 
affect the performance and scalability level that can be reached under heavy load or when using 
high end hardware (many CPUs, disks, etc.). 
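
As a small worked example, the sketch below derives the first three parameters from raw 
measurements. The LoadTestSample class and the summarize function are hypothetical names for 
this illustration, and the figures fed to them would come from whatever load-testing tool is in 
use. 

```python
from dataclasses import dataclass


@dataclass
class LoadTestSample:
    """One completed request from a hypothetical load test."""
    latency_ms: float  # time from sending the request to receiving the response
    bytes_sent: int    # size of the response body


def summarize(samples, duration_s: float) -> dict:
    """Derive requests/second, mean latency and throughput from raw samples."""
    if duration_s <= 0 or not samples:
        raise ValueError("need at least one sample and a positive duration")
    return {
        "requests_per_second": len(samples) / duration_s,
        "mean_latency_ms": sum(s.latency_ms for s in samples) / len(samples),
        "throughput_bytes_per_second": sum(s.bytes_sent for s in samples) / duration_s,
    }
```

Repeating the measurement at several concurrency levels, and comparing the summaries, shows how 
the three figures change as the number of active connections grows. 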


Load limits 


A web server (program) has defined load limits, because it can handle only a limited number 
of concurrent client connections (usually between 2 and 60,000, by default between 500 and 
1,000) per IP address (and IP port) and it can serve only a certain maximum number of requests 
per second depending on: 

• its own settings; 

• the HTTP request type; 

• content origin (static or dynamic); 

• whether the served content is cached; 

• the hardware and software limits of the OS where it is working. 

When a web server is near to or over its limits, it becomes overloaded and thus 
unresponsive. 
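
The idea of a fixed limit on concurrent connections can be illustrated with a simple counter. 
This is only a sketch: the ConnectionLimiter class is hypothetical, and real servers enforce 
their limits through configuration settings and operating system resources rather than a class 
like this one. 

```python
import threading


class ConnectionLimiter:
    """Track open connections and refuse new ones beyond a fixed limit."""

    def __init__(self, max_connections: int):
        self.max_connections = max_connections
        self._open = 0
        self._lock = threading.Lock()  # protects the counter across threads

    def try_accept(self) -> bool:
        """Reserve a slot; return False when the server is at its limit."""
        with self._lock:
            if self._open >= self.max_connections:
                # A caller would typically answer 503 or refuse the connection.
                return False
            self._open += 1
            return True

    def release(self) -> None:
        """Free a slot when a connection closes."""
        with self._lock:
            if self._open > 0:
                self._open -= 1
```

Once the limit is reached, every further client is turned away until an existing connection is 
released, which is the behaviour clients experience as an unresponsive server. 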


Overload causes 


At any time Web servers can be overloaded because of: 


• too much legitimate Web traffic (e.g. thousands or even millions of clients 
hitting the Web site in a short interval of time); 

• DDoS (Distributed Denial of Service) attacks; 

• computer worms, which sometimes cause abnormal traffic because of millions of 
infected computers (not coordinated among them); 

• XSS viruses, which can cause high traffic because of millions of infected browsers 
and/or web servers; 

• Internet web robot traffic that is not filtered or limited on large web sites with very 
few resources (bandwidth, etc.); 

• Internet (network) slowdowns, so that client requests are served more slowly 
and the number of connections increases so much that server limits are 
reached; 

• partial unavailability of Web servers (computers), which can happen because of 
required or urgent maintenance or upgrades, HW or SW failures, back-end (e.g. 
DB) failures, etc.; in these cases the remaining web servers get too much traffic 
and become overloaded. 


Overload symptoms 


The symptoms of an overloaded Web server are: 


• requests are served with (possibly long) delays (from 1 second to a few 
hundred seconds); 

• 500, 502, 503, 504 HTTP errors are returned to clients (sometimes also 
unrelated 404 error or even 408 error may be returned); 

• TCP connections are refused or reset (interrupted) before any content is sent to 
clients; 

• in very rare cases, only partial contents are sent (but this behaviour may well be 
considered a bug, even if it usually depends on unavailable system resources). 


Anti-overload techniques 


To partially overcome the load limits above and to prevent the overload scenario, most popular 
Web sites use common techniques like: 

• managing network traffic, by using: 
o Firewalls to block unwanted traffic coming from bad IP sources or having 
bad patterns; 
o HTTP traffic managers to drop, redirect or rewrite requests having bad 
HTTP patterns; 
o Bandwidth management and Traffic shaping, in order to smooth down peaks 
in network usage; 

• deploying Web cache techniques; 

• using different domain names to serve different (static and dynamic) content by 
separate Web servers, e.g.: 
o http://images.example.com 
o http://www.example.com 

• using different domain names and / or computers to separate big files from 
small and medium sized files; the idea is to be able to fully cache small and 
medium sized files and to efficiently serve big or huge (over 10 - 1000 MB) files 
by using different settings; 

• using many Web servers (programs) per computer, each one bound to its own 
network card and IP address; 

• using many Web servers (computers) that are grouped together so that they act 
or are seen as one big Web server, see also: Load balancer; 

• adding more HW resources (e.g. RAM, disks) to each computer; 

• tuning OS parameters for HW capabilities and usage; 

• using more efficient computer programs for Web servers, etc.; 

• using other workarounds, especially if dynamic content is involved. 
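
Grouping several Web servers so that they are seen as one is usually the job of a load balancer. 
The sketch below shows one simple policy, round-robin rotation, as an illustration only: the 
RoundRobinBalancer class and the backend host names are hypothetical, and production load 
balancers typically also track server health and current load. 

```python
import itertools


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly, in order.
        self._cycle = itertools.cycle(backends)

    def next_backend(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._cycle)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1.example.com", "web2.example.com"])
    for _ in range(4):
        print(balancer.next_backend())
```

Round-robin spreads requests evenly only when the backends are similar and requests cost about 
the same; other policies (for example, choosing the backend with the fewest open connections) 
address cases where that does not hold. 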


Historical notes 


In 1989 Tim Berners-Lee proposed to his employer CERN (European Organization for 
Nuclear Research) a new project, which had the goal of easing the exchange of information 
between scientists by using a hypertext system. As a result of the implementation of this project, 
in 1990 Berners-Lee wrote two programs: 

e a browser called WorldWideWeb; 
e the world's first Web server, which ran on NeXTSTEP; NOTE: today, this 
machine is on exhibition at CERN's public museum, Microcosm. 

The first web server in the U.S.A. was installed on December 12, 1991 at SLAC [1].

Between 1991 and 1994 the simplicity and effectiveness of the early technologies used to surf
and exchange data through the World Wide Web helped to:
• port them to many different operating systems;
• spread their use among many different social groups, first in scientific
organizations, then in universities and finally in industry.

In 1994 Tim Berners-Lee decided to establish the World Wide Web Consortium to regulate
the further development of the many technologies involved (HTTP, HTML, etc.) through a
standardization process.

The following years are recent history, which has seen exponential growth (becoming
explosive after 2000) in the number of web sites and, of course, in the number of Web servers.


Software 


As of July 2007, the most common HTTP serving programs are:
• Apache HTTP Server
• Microsoft*
• Sun†
• lighttpd

* Microsoft is the sum of sites running Microsoft-Internet-Information-Server, Microsoft-IIS, Microsoft-IIS-W,
Microsoft-PWS-95, & Microsoft-PWS.

† Sun is the sum of sites running SunONE, iPlanet-Enterprise, Netscape-Enterprise, Netscape-FastTrack, Netscape-Commerce,
Netscape-Communications, Netsite-Commerce & Netsite-Communications.

There are thousands of different Web server programs available, many of which are 
specialized for very specific purposes, so the fact that a web server is not very popular does not 
necessarily mean that it has a lot of bugs or poor performance. 


Statistics 


The most popular Web servers, used for public Web sites, are tracked by the Netcraft Web
Server Survey, with details given by Netcraft Web Server Reports.

According to Netcraft, Apache has been the most popular Web server on the Internet since
April 1996. The January 2007 Netcraft Web Server Survey found that about 60% of the Web
sites on the Internet were using Apache, followed by IIS with a share of about 30%.

Another site providing statistics is SecuritySpace, which also provides a detailed breakdown
for each version of Web server.


See also 


• Web hosting service


Webmaster


The webmaster (feminine: webmistress), also called the system administrator, the author,
or the website administrator, is the person responsible for designing, developing, marketing, or
maintaining a website.


Webmaster as a profession 


Webmasters are practitioners of Web communication. Typically, they are generalists with 
HTML expertise who manage all aspects of Web operations. On a smaller site, the webmaster 
will typically be the owner, developer and/or programmer, in addition to the author of the 
content. 

On larger sites, the webmaster will act as a coordinator and overseer of the activities of other
people working on the site and is usually an employee of the owner of the Web site; hence
webmaster can also be listed as an occupation. A webmaster who is hired by a larger Web site, or
promoted to the position, may do anything from web design to project management or
employee supervision. The range of activities to be undertaken by a webmaster may be estimated
using the concept of Website scale.

In the early days of the use of the term "webmaster" (a take-off on the term "postmaster", the 
administrator of an e-mail system), this role encompassed all aspects of planning, coding, 
production, and user interface. The webmaster may have many of the duties of an information 
architect, including ensuring site usability, user experience and menu taxonomy. 

However, since the late 1990s, this type of webmaster role has typically been found only on
small Web sites that can be managed by one person, or in environments where there is not a
great deal of role definition. The current model tends to be more team-oriented, with a website
manager or online producer leading a team consisting of web developers, designers,
programmers, a QA lead, Adobe Flash developers and often at least one usability expert or a
UI/UE team. In established web development companies, especially those existing since the 1990s,
the term webmaster may be used by senior officers of the company, and may include usage such
as "Webmaster-in-chief."

A broader definition of webmaster is a businessperson who uses online media to sell products 
and/or services. This broader definition of webmaster covers not just the technical aspects of 
overseeing Web site construction and maintenance but also management of content, advertising, 
marketing and order fulfillment for the Web site. 

Core responsibilities of the webmaster include regulating and managing the access
rights of different users of a web site, its appearance, and setting up web site navigation. Content
placement can be part of a webmaster's responsibilities, while content creation is typically not.

Typically, the webmaster is the agent who reads user feedback and complaints about site 
functionality. 


See also 


• Web design
• Online producer


License


Copyright © version 1.0 2006 by MultiMedia and Nicolae Sfetcu. Permission is granted to 
copy, distribute and/or modify this document under the terms of the GNU Free Documentation 
License, Version 1.2 or any later version published by the Free Software Foundation; with no 
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the 
license can be found in this page, as well as at the GNU Free Documentation License. 

This book, in all its versions (including those modified by third parties in Italian, English or
any other language), may, by the will of the authors, be reproduced in full without
violating any law, inasmuch as this book is released under the GNU Free Documentation
License.


This book:

• May be modified, partially or in full, to create manuals for companies,
agencies or persons who deal with formatting, changing the diagrams,
the contents or the pagination.

• May be distributed in its original or in modified form, in
electronic or in paper format, by trade periodicals or otherwise, Internet sites
and any other medium.

• May be used as an internal manual by companies, public or private agencies, or
universities.

• May be distributed by universities as a hand-out.

• May even be resold without paying any type of royalty to the
authors, on the condition that the purchasers are granted the freedom to make
even complete copies and to redistribute or resell them.


GNU Free Documentation License 




Version 1.2, November 2002 


Copyright (C) 2000,2001,2002 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


0. PREAMBLE

The purpose of this License is to make a manual, textbook, or other functional and useful
document "free" in the sense of freedom: to assure everyone the effective freedom to copy and
redistribute it, with or without modifying it, either commercially or noncommercially.
Secondarily, this License preserves for the author and publisher a way to get credit for their 
work, while not being considered responsible for modifications made by others. 

This License is a kind of "copyleft", which means that derivative works of the document 
must themselves be free in the same sense. It complements the GNU General Public License, 
which is a copyleft license designed for free software. 

We have designed this License in order to use it for manuals for free software, because free 
software needs free documentation: a free program should come with manuals providing the 
same freedoms that the software does. But this License is not limited to software manuals; it can 
be used for any textual work, regardless of subject matter or whether it is published as a printed 
book. We recommend this License principally for works whose purpose is instruction or 
reference. 

1. APPLICABILITY AND DEFINITIONS 

This License applies to any manual or other work, in any medium, that contains a notice 
placed by the copyright holder saying it can be distributed under the terms of this License. Such 
a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under 
the conditions stated herein. The "Document", below, refers to any such manual or work. Any 
member of the public is a licensee, and is addressed as "you". You accept the license if you copy, 
modify or distribute the work in a way requiring permission under copyright law. 

A "Modified Version" of the Document means any work containing the Document or a 
portion of it, either copied verbatim, or with modifications and/or translated into another 
language. 

A "Secondary Section" is a named appendix or a front-matter section of the Document that 
deals exclusively with the relationship of the publishers or authors of the Document to the 
Document's overall subject (or to related matters) and contains nothing that could fall directly 
within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a 
Secondary Section may not explain any mathematics.) The relationship could be a matter of 
historical connection with the subject or with related matters, or of legal, commercial, 
philosophical, ethical or political position regarding them. 

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being 
those of Invariant Sections, in the notice that says that the Document is released under this 
License. If a section does not fit the above definition of Secondary then it is not allowed to be 
designated as Invariant. The Document may contain zero Invariant Sections. If the Document 
does not identify any Invariant Sections then there are none. 

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or 
Back-Cover Texts, in the notice that says that the Document is released under this License. A 
Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words. 

A "Transparent" copy of the Document means a machine-readable copy, represented in a 
format whose specification is available to the general public, that is suitable for revising the 
document straightforwardly with generic text editors or (for images composed of pixels) generic 
paint programs or (for drawings) some widely available drawing editor, and that is suitable for 
input to text formatters or for automatic translation to a variety of formats suitable for input to 
text formatters. A copy made in an otherwise Transparent file format whose markup, or absence 
of markup, has been arranged to thwart or discourage subsequent modification by readers is not
Transparent. An image format is not Transparent if used for any substantial amount of text. A
copy that is not "Transparent" is called "Opaque". 

Examples of suitable formats for Transparent copies include plain ASCII without markup, 
Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and 
standard-conforming simple HTML, PostScript or PDF designed for human modification. 
Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include 
proprietary formats that can be read and edited only by proprietary word processors, SGML or 
XML for which the DTD and/or processing tools are not generally available, and the
machine-generated HTML, PostScript or PDF produced by some word processors for output purposes
only. 

The "Title Page" means, for a printed book, the title page itself, plus such following pages as 
are needed to hold, legibly, the material this License requires to appear in the title page. For 
works in formats which do not have any title page as such, "Title Page" means the text near the 
most prominent appearance of the work's title, preceding the beginning of the body of the text. 

A section "Entitled XYZ" means a named subunit of the Document whose title either is 
precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another 
language. (Here XYZ stands for a specific section name mentioned below, such as
"Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of
such a section when you modify the Document means that it remains a section "Entitled XYZ" 
according to this definition. 

The Document may include Warranty Disclaimers next to the notice which states that this 
License applies to the Document. These Warranty Disclaimers are considered to be included by 
reference in this License, but only as regards disclaiming warranties: any other implication that 
these Warranty Disclaimers may have is void and has no effect on the meaning of this License. 

2. VERBATIM COPYING 

You may copy and distribute the Document in any medium, either commercially or 
noncommercially, provided that this License, the copyright notices, and the license notice saying 
this License applies to the Document are reproduced in all copies, and that you add no other 
conditions whatsoever to those of this License. You may not use technical measures to obstruct 
or control the reading or further copying of the copies you make or distribute. However, you may 
accept compensation in exchange for copies. If you distribute a large enough number of copies 
you must also follow the conditions in section 3. 

You may also lend copies, under the same conditions stated above, and you may publicly 
display copies. 

3. COPYING IN QUANTITY 

If you publish printed copies (or copies in media that commonly have printed covers) of the 
Document, numbering more than 100, and the Document's license notice requires Cover Texts, 
you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: 
Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers 
must also clearly and legibly identify you as the publisher of these copies. The front cover must 
present the full title with all words of the title equally prominent and visible. You may add other 
material on the covers in addition. Copying with changes limited to the covers, as long as they 
preserve the title of the Document and satisfy these conditions, can be treated as verbatim 
copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first
ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent 
pages. 

If you publish or distribute Opaque copies of the Document numbering more than 100, you 
must either include a machine-readable Transparent copy along with each Opaque copy, or state 
in or with each Opaque copy a computer-network location from which the general network-using 
public has access to download using public-standard network protocols a complete Transparent 
copy of the Document, free of added material. If you use the latter option, you must take 
reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure 
that this Transparent copy will remain thus accessible at the stated location until at least one year 
after the last time you distribute an Opaque copy (directly or through your agents or retailers) of 
that edition to the public. 

It is requested, but not required, that you contact the authors of the Document well before 
redistributing any large number of copies, to give them a chance to provide you with an updated 
version of the Document. 

4. MODIFICATIONS 

You may copy and distribute a Modified Version of the Document under the conditions of 
sections 2 and 3 above, provided that you release the Modified Version under precisely this 
License, with the Modified Version filling the role of the Document, thus licensing distribution 
and modification of the Modified Version to whoever possesses a copy of it. In addition, you 
must do these things in the Modified Version: 


• A. Use in the Title Page (and on the covers, if any) a title distinct from that of the
Document, and from those of previous versions (which should, if there were
any, be listed in the History section of the Document). You may use the same
title as a previous version if the original publisher of that version gives
permission.

• B. List on the Title Page, as authors, one or more persons or entities responsible
for authorship of the modifications in the Modified Version, together with at
least five of the principal authors of the Document (all of its principal authors, if
it has fewer than five), unless they release you from this requirement.

• C. State on the Title page the name of the publisher of the Modified Version, as
the publisher.

• D. Preserve all the copyright notices of the Document.

• E. Add an appropriate copyright notice for your modifications adjacent to the
other copyright notices.

• F. Include, immediately after the copyright notices, a license notice giving the
public permission to use the Modified Version under the terms of this License,
in the form shown in the Addendum below.

• G. Preserve in that license notice the full lists of Invariant Sections and required
Cover Texts given in the Document's license notice.

• H. Include an unaltered copy of this License.

• I. Preserve the section Entitled "History", Preserve its Title, and add to it an item
stating at least the title, year, new authors, and publisher of the Modified
Version as given on the Title Page. If there is no section Entitled "History" in the
Document, create one stating the title, year, authors, and publisher of the
Document as given on its Title Page, then add an item describing the Modified
Version as stated in the previous sentence.

• J. Preserve the network location, if any, given in the Document for public access
to a Transparent copy of the Document, and likewise the network locations
given in the Document for previous versions it was based on. These may be
placed in the "History" section. You may omit a network location for a work that
was published at least four years before the Document itself, or if the original
publisher of the version it refers to gives permission.

• K. For any section Entitled "Acknowledgements" or "Dedications", Preserve the
Title of the section, and preserve in the section all the substance and tone of
each of the contributor acknowledgements and/or dedications given therein.

• L. Preserve all the Invariant Sections of the Document, unaltered in their text
and in their titles. Section numbers or the equivalent are not considered part of
the section titles.

• M. Delete any section Entitled "Endorsements". Such a section may not be
included in the Modified Version.

• N. Do not retitle any existing section to be Entitled "Endorsements" or to
conflict in title with any Invariant Section.

• O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as 
Secondary Sections and contain no material copied from the Document, you may at your option 
designate some or all of these sections as invariant. To do this, add their titles to the list of 
Invariant Sections in the Modified Version's license notice. These titles must be distinct from any 
other section titles. 

You may add a section Entitled "Endorsements", provided it contains nothing but 
endorsements of your Modified Version by various parties--for example, statements of peer 
review or that the text has been approved by an organization as the authoritative definition of a 
standard. 

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 
words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only 
one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through 
arrangements made by) any one entity. If the Document already includes a cover text for the 
same cover, previously added by you or by arrangement made by the same entity you are acting 
on behalf of, you may not add another; but you may replace the old one, on explicit permission 
from the previous publisher that added the old one. 

The author(s) and publisher(s) of the Document do not by this License give permission to use 
their names for publicity for or to assert or imply endorsement of any Modified Version. 

5. COMBINING DOCUMENTS 

You may combine the Document with other documents released under this License, under 
the terms defined in section 4 above for modified versions, provided that you include in the 
combination all of the Invariant Sections of all of the original documents, unmodified, and list
them all as Invariant Sections of your combined work in its license notice, and that you preserve
all their Warranty Disclaimers. 

The combined work need only contain one copy of this License, and multiple identical 
Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections 
with the same name but different contents, make the title of each such section unique by adding 
at the end of it, in parentheses, the name of the original author or publisher of that section if 
known, or else a unique number. Make the same adjustment to the section titles in the list of 
Invariant Sections in the license notice of the combined work. 

In the combination, you must combine any sections Entitled "History" in the various original 
documents, forming one section Entitled "History"; likewise combine any sections Entitled 
"Acknowledgements", and any sections Entitled "Dedications". You must delete all sections 
Entitled "Endorsements." 

6. COLLECTIONS OF DOCUMENTS 

You may make a collection consisting of the Document and other documents released under 
this License, and replace the individual copies of this License in the various documents with a 
single copy that is included in the collection, provided that you follow the rules of this License 
for verbatim copying of each of the documents in all other respects. 

You may extract a single document from such a collection, and distribute it individually 
under this License, provided you insert a copy of this License into the extracted document, and 
follow this License in all other respects regarding verbatim copying of that document. 

7. AGGREGATION WITH INDEPENDENT WORKS 

A compilation of the Document or its derivatives with other separate and independent 
documents or works, in or on a volume of a storage or distribution medium, is called an 
"aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of 
the compilation's users beyond what the individual works permit. When the Document is 
included in an aggregate, this License does not apply to the other works in the aggregate which 
are not themselves derivative works of the Document. 

If the Cover Text requirement of section 3 is applicable to these copies of the Document, 
then if the Document is less than one half of the entire aggregate, the Document's Cover Texts 
may be placed on covers that bracket the Document within the aggregate, or the electronic 
equivalent of covers if the Document is in electronic form. Otherwise they must appear on 
printed covers that bracket the whole aggregate. 

8. TRANSLATION 

Translation is considered a kind of modification, so you may distribute translations of the 
Document under the terms of section 4. Replacing Invariant Sections with translations requires 
special permission from their copyright holders, but you may include translations of some or all 
Invariant Sections in addition to the original versions of these Invariant Sections. You may 
include a translation of this License, and all the license notices in the Document, and any 
Warranty Disclaimers, provided that you also include the original English version of this License 
and the original versions of those notices and disclaimers. In case of a disagreement between the 
translation and the original version of this License or a notice or disclaimer, the original version 
will prevail. 

If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", 
the requirement (section 4) to Preserve its Title (section 1) will typically require changing the 
actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly 
provided for under this License. Any other attempt to copy, modify, sublicense or distribute the 
Document is void, and will automatically terminate your rights under this License. However, 
parties who have received copies, or rights, from you under this License will not have their 
licenses terminated so long as such parties remain in full compliance. 

10. FUTURE REVISIONS OF THIS LICENSE 

The Free Software Foundation may publish new, revised versions of the GNU Free 
Documentation License from time to time. Such new versions will be similar in spirit to the 
present version, but may differ in detail to address new problems or concerns. See 
http://www.gnu.org/copyleft/. 

Each version of the License is given a distinguishing version number. If the Document 
specifies that a particular numbered version of this License "or any later version" applies to it, 
you have the option of following the terms and conditions either of that specified version or of 
any later version that has been published (not as a draft) by the Free Software Foundation. If the 
Document does not specify a version number of this License, you may choose any version ever 
published (not as a draft) by the Free Software Foundation.